chessdevilai

This model is a fine-tuned version of EleutherAI/pythia-70m-deduped on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7609
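
For a sense of scale, the loss can be converted to a perplexity. This is a minimal sketch, assuming the reported value is the mean per-token cross-entropy in nats; that is the usual convention for causal language model fine-tuning, but the card does not state it.

```python
import math

# Final evaluation loss reported above.
eval_loss = 0.7609

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so the corresponding perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is about 2.14.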

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 1
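
To illustrate what the cosine scheduler does to the learning rate over the single epoch, here is a small pure-Python sketch. It assumes the common half-cosine decay to zero with no warmup (no warmup setting is listed above) and a total of roughly 6188 optimizer steps, a figure estimated from the results table (step 6138 at epoch 0.9919) rather than stated on the card. The names `BASE_LR`, `TOTAL_STEPS`, `WARMUP_STEPS`, and `cosine_lr` are illustrative only.

```python
import math

BASE_LR = 5e-05      # learning_rate from the list above
TOTAL_STEPS = 6188   # assumption: estimated from the results table, not stated on the card
WARMUP_STEPS = 0     # assumption: no warmup is listed on the card


def cosine_lr(step: int) -> float:
    """Learning rate at `step`: optional linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(1.0, progress)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the learning rate at the start, the quarter points, and the end of training.
for step in (0, 1547, 3094, 4641, 6188):
    print(f"step {step:>4}: lr = {cosine_lr(step):.2e}")
```

If the run did follow this shape, the learning rate would be close to zero over the last few hundred steps, which would be consistent with the validation loss flattening at 0.7609 in the final rows of the results table.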

Training results

Training Loss  Epoch   Step  Validation Loss
1.1912         0.0100  62    1.2654
1.1714         0.0200  124   1.1780
1.0771         0.0301  186   1.1419
1.0829         0.0401  248   1.1046
1.0113         0.0501  310   1.0850
1.152          0.0601  372   1.0701
1.0895         0.0701  434   1.0544
0.9123         0.0802  496   1.0484
1.0489         0.0902  558   1.0214
1.0312         0.1002  620   1.0252
0.9756         0.1102  682   1.0020
1.0125         0.1202  744   0.9940
1.0581         0.1303  806   0.9862
1.0726         0.1403  868   0.9809
0.9963         0.1503  930   0.9830
0.9309         0.1603  992   0.9653
0.8858         0.1703  1054  0.9538
1.1137         0.1803  1116  0.9472
0.9024         0.1904  1178  0.9411
0.9812         0.2004  1240  0.9396
0.9916         0.2104  1302  0.9254
0.9509         0.2204  1364  0.9334
0.8848         0.2304  1426  0.9439
0.8302         0.2405  1488  0.9175
1.0111         0.2505  1550  0.9158
1.0273         0.2605  1612  0.9182
0.8968         0.2705  1674  0.9116
0.8892         0.2805  1736  0.9098
0.7539         0.2906  1798  0.8896
0.811          0.3006  1860  0.8968
0.928          0.3106  1922  0.8875
0.8163         0.3206  1984  0.8821
0.9202         0.3306  2046  0.8820
1.0208         0.3407  2108  0.8811
0.8297         0.3507  2170  0.8823
0.8213         0.3607  2232  0.8736
0.8324         0.3707  2294  0.8698
0.7721         0.3807  2356  0.8735
0.9504         0.3908  2418  0.8705
0.858          0.4008  2480  0.8620
0.8791         0.4108  2542  0.8540
0.8411         0.4208  2604  0.8606
0.8845         0.4308  2666  0.8496
0.7752         0.4409  2728  0.8462
0.8598         0.4509  2790  0.8481
0.7935         0.4609  2852  0.8412
0.7352         0.4709  2914  0.8392
0.8153         0.4809  2976  0.8426
0.7371         0.4910  3038  0.8332
0.7136         0.5010  3100  0.8300
0.9777         0.5110  3162  0.8294
0.8336         0.5210  3224  0.8306
0.7546         0.5310  3286  0.8234
0.8436         0.5410  3348  0.8237
0.9316         0.5511  3410  0.8224
0.6996         0.5611  3472  0.8191
0.7417         0.5711  3534  0.8146
0.8528         0.5811  3596  0.8110
0.6861         0.5911  3658  0.8095
0.8401         0.6012  3720  0.8096
0.7056         0.6112  3782  0.8080
0.8643         0.6212  3844  0.8004
0.7575         0.6312  3906  0.8018
0.8133         0.6412  3968  0.8008
0.8221         0.6513  4030  0.7940
0.8004         0.6613  4092  0.7948
0.7002         0.6713  4154  0.7984
0.8425         0.6813  4216  0.7892
0.6777         0.6913  4278  0.7876
0.9178         0.7014  4340  0.7865
0.787          0.7114  4402  0.7844
0.6979         0.7214  4464  0.7829
0.7954         0.7314  4526  0.7825
0.7937         0.7414  4588  0.7792
0.7849         0.7515  4650  0.7790
0.7108         0.7615  4712  0.7782
0.831          0.7715  4774  0.7768
0.8242         0.7815  4836  0.7741
0.7472         0.7915  4898  0.7731
0.8171         0.8016  4960  0.7732
0.7857         0.8116  5022  0.7702
0.7925         0.8216  5084  0.7707
0.7134         0.8316  5146  0.7680
0.8401         0.8416  5208  0.7686
0.6919         0.8516  5270  0.7679
0.7689         0.8617  5332  0.7658
0.7899         0.8717  5394  0.7645
0.8457         0.8817  5456  0.7639
0.7738         0.8917  5518  0.7635
0.7943         0.9017  5580  0.7628
0.756          0.9118  5642  0.7625
0.8021         0.9218  5704  0.7619
0.7325         0.9318  5766  0.7615
0.7312         0.9418  5828  0.7613
0.8255         0.9518  5890  0.7613
0.794          0.9619  5952  0.7610
0.7392         0.9719  6014  0.7609
0.841          0.9819  6076  0.7609
0.7018         0.9919  6138  0.7609
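
The Epoch and Step columns above allow a rough estimate of the run's size, since the card does not name the dataset. The sketch below derives steps per epoch from the last row and then an approximate example count; it assumes a single device and no gradient accumulation, so that each step consumes exactly one batch of 8. Neither assumption is confirmed by the card.

```python
# Epoch and step copied from the last row of the table above.
last_epoch = 0.9919
last_step = 6138

TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list

# Logged epoch values are rounded to four decimals, so the last row
# gives the tightest estimate of how many steps make up one epoch.
steps_per_epoch = last_step / last_epoch

# Assumption: one device and no gradient accumulation, so one step = one batch.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"steps per epoch (approx.): {steps_per_epoch:.0f}")
print(f"training examples (approx.): {approx_train_examples:.0f}")
```

Under these assumptions one epoch is about 6188 steps, or a little under 50,000 training examples.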

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size: 70.4M params (Safetensors, tensor type F32)

Model repository: Vasanth/chessdevilai (fine-tuned from EleutherAI/pythia-70m-deduped)