gpt2_distily

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1280.4192
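
For reference, here is a minimal loading-and-generation sketch with the Transformers API. It assumes the checkpoint is published under the repository id lapp0/gpt2_distily (taken from the model tree at the end of this card); adjust the id if the weights live elsewhere.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id taken from the model tree below; change it if the checkpoint is hosted elsewhere.
model_id = "lapp0/gpt2_distily"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Simple greedy generation from a short prompt.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```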

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 1.0
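
The sketch below reconstructs these hyperparameters as Transformers TrainingArguments. It is an illustration only: the output directory is a placeholder, and dataset/Trainer wiring is not part of the original card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2_distily",        # placeholder, not from the original card
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    num_train_epochs=1.0,
)
```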

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| No log        | 0      | 0     | 13088.9219      |
| 2126.0        | 0.0323 | 2000  | 1936.6752       |
| 1836.0        | 0.0646 | 4000  | 1773.7856       |
| 1808.0        | 0.0970 | 6000  | 1547.3760       |
| 1546.0        | 0.1293 | 8000  | 1487.7600       |
| 1576.0        | 0.1616 | 10000 | 1449.2288       |
| 1548.0        | 0.1939 | 12000 | 1413.3920       |
| 1246.0        | 0.2263 | 14000 | 1387.0656       |
| 1516.0        | 0.2586 | 16000 | 1364.6768       |
| 1330.0        | 0.2909 | 18000 | 1346.6160       |
| 1284.0        | 0.3232 | 20000 | 1332.1040       |
| 1290.0        | 0.3556 | 22000 | 1320.7792       |
| 1390.0        | 0.3879 | 24000 | 1310.2496       |
| 1568.0        | 0.4202 | 26000 | 1302.9937       |
| 1404.0        | 0.4525 | 28000 | 1299.0112       |
| 1528.0        | 0.4848 | 30000 | 1293.5887       |
| 1263.0        | 0.5172 | 32000 | 1290.0032       |
| 1294.0        | 0.5495 | 34000 | 1287.8672       |
| 1355.0        | 0.5818 | 36000 | 1285.7808       |
| 1300.0        | 0.6141 | 38000 | 1283.5009       |
| 1368.0        | 0.6465 | 40000 | 1282.9136       |
| 1496.0        | 0.6788 | 42000 | 1281.6096       |
| 1502.0        | 0.7111 | 44000 | 1281.7408       |
| 1352.0        | 0.7434 | 46000 | 1280.9344       |
| 1418.0        | 0.7758 | 48000 | 1280.6288       |
| 1158.0        | 0.8081 | 50000 | 1280.5760       |
| 1534.0        | 0.8404 | 52000 | 1280.4000       |
| 1276.0        | 0.8727 | 54000 | 1280.4032       |
| 1184.0        | 0.9051 | 56000 | 1280.4160       |
| 1370.0        | 0.9374 | 58000 | 1280.4320       |
| 1210.0        | 0.9697 | 60000 | 1280.4192       |

Framework versions

  • Transformers 4.43.3
  • Pytorch 2.3.0
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Safetensors

  • Model size: 124M params
  • Tensor type: BF16

Model tree for lapp0/gpt2_distily

  • Base model: gpt2 (this model is a fine-tune of gpt2)