---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: my_awesome_opus_books_model
    results: []
---

# my_awesome_opus_books_model

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.9121
- Bleu: 0.0681
- Gen Len: 19.0
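
A minimal inference sketch, assuming the checkpoint is published as `ombarki345/my_awesome_opus_books_model` (a repo id inferred from this card, not confirmed by it) and that, like most T5 fine-tunes, the model expects a task prefix; the English-to-French prefix below is a placeholder:

```python
# Minimal inference sketch; the repo id and task prefix are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "ombarki345/my_awesome_opus_books_model"  # inferred from this card
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# T5 checkpoints are usually prompted with a task prefix such as this one.
text = "translate English to French: The cat sat on the mat."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```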

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
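
A sketch of how these settings map onto `Seq2SeqTrainingArguments`; `output_dir`, the evaluation cadence, and `predict_with_generate` are assumptions needed to produce the per-epoch BLEU/Gen Len columns below, not values taken from this card:

```python
# Hyperparameter sketch; output_dir and evaluation settings are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_opus_books_model",  # assumed
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer
    # defaults (adam_beta1 / adam_beta2 / adam_epsilon), so no override needed.
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                     # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",   # assumed: the table reports metrics per epoch
    predict_with_generate=True,    # assumed: required for BLEU / Gen Len
)
```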

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 15   | 3.8537          | 0.0406 | 19.0    |
| No log        | 2.0   | 30   | 3.3215          | 0.0355 | 19.0    |
| No log        | 3.0   | 45   | 3.0488          | 0.0322 | 19.0    |
| No log        | 4.0   | 60   | 2.8891          | 0.0327 | 19.0    |
| No log        | 5.0   | 75   | 2.7809          | 0.0451 | 19.0    |
| No log        | 6.0   | 90   | 2.7010          | 0.0489 | 19.0    |
| No log        | 7.0   | 105  | 2.6242          | 0.0416 | 19.0    |
| No log        | 8.0   | 120  | 2.5600          | 0.053  | 19.0    |
| No log        | 9.0   | 135  | 2.5093          | 0.0512 | 19.0    |
| No log        | 10.0  | 150  | 2.4712          | 0.0526 | 19.0    |
| No log        | 11.0  | 165  | 2.4346          | 0.0854 | 19.0    |
| No log        | 12.0  | 180  | 2.3966          | 0.0892 | 19.0    |
| No log        | 13.0  | 195  | 2.3590          | 0.1197 | 19.0    |
| No log        | 14.0  | 210  | 2.3271          | 0.1243 | 19.0    |
| No log        | 15.0  | 225  | 2.2995          | 0.1243 | 19.0    |
| No log        | 16.0  | 240  | 2.2742          | 0.1004 | 19.0    |
| No log        | 17.0  | 255  | 2.2569          | 0.1073 | 19.0    |
| No log        | 18.0  | 270  | 2.2399          | 0.1305 | 19.0    |
| No log        | 19.0  | 285  | 2.2236          | 0.1288 | 19.0    |
| No log        | 20.0  | 300  | 2.2085          | 0.1248 | 19.0    |
| No log        | 21.0  | 315  | 2.1936          | 0.1153 | 19.0    |
| No log        | 22.0  | 330  | 2.1801          | 0.093  | 19.0    |
| No log        | 23.0  | 345  | 2.1685          | 0.1079 | 19.0    |
| No log        | 24.0  | 360  | 2.1568          | 0.1079 | 19.0    |
| No log        | 25.0  | 375  | 2.1464          | 0.0881 | 19.0    |
| No log        | 26.0  | 390  | 2.1365          | 0.0881 | 19.0    |
| No log        | 27.0  | 405  | 2.1264          | 0.0876 | 19.0    |
| No log        | 28.0  | 420  | 2.1166          | 0.0858 | 19.0    |
| No log        | 29.0  | 435  | 2.1079          | 0.0858 | 19.0    |
| No log        | 30.0  | 450  | 2.1001          | 0.0863 | 19.0    |
| No log        | 31.0  | 465  | 2.0919          | 0.0871 | 19.0    |
| No log        | 32.0  | 480  | 2.0853          | 0.0833 | 19.0    |
| No log        | 33.0  | 495  | 2.0781          | 0.0833 | 19.0    |
| 2.7093        | 34.0  | 510  | 2.0698          | 0.0833 | 19.0    |
| 2.7093        | 35.0  | 525  | 2.0632          | 0.0833 | 19.0    |
| 2.7093        | 36.0  | 540  | 2.0562          | 0.0828 | 19.0    |
| 2.7093        | 37.0  | 555  | 2.0514          | 0.0799 | 19.0    |
| 2.7093        | 38.0  | 570  | 2.0458          | 0.0761 | 19.0    |
| 2.7093        | 39.0  | 585  | 2.0400          | 0.0761 | 19.0    |
| 2.7093        | 40.0  | 600  | 2.0352          | 0.0799 | 19.0    |
| 2.7093        | 41.0  | 615  | 2.0297          | 0.0761 | 19.0    |
| 2.7093        | 42.0  | 630  | 2.0244          | 0.0748 | 19.0    |
| 2.7093        | 43.0  | 645  | 2.0200          | 0.075  | 19.0    |
| 2.7093        | 44.0  | 660  | 2.0155          | 0.0748 | 19.0    |
| 2.7093        | 45.0  | 675  | 2.0104          | 0.0748 | 19.0    |
| 2.7093        | 46.0  | 690  | 2.0053          | 0.075  | 19.0    |
| 2.7093        | 47.0  | 705  | 2.0012          | 0.075  | 19.0    |
| 2.7093        | 48.0  | 720  | 1.9966          | 0.075  | 19.0    |
| 2.7093        | 49.0  | 735  | 1.9923          | 0.075  | 19.0    |
| 2.7093        | 50.0  | 750  | 1.9890          | 0.075  | 19.0    |
| 2.7093        | 51.0  | 765  | 1.9856          | 0.0747 | 19.0    |
| 2.7093        | 52.0  | 780  | 1.9820          | 0.0747 | 19.0    |
| 2.7093        | 53.0  | 795  | 1.9793          | 0.0752 | 19.0    |
| 2.7093        | 54.0  | 810  | 1.9763          | 0.0733 | 19.0    |
| 2.7093        | 55.0  | 825  | 1.9731          | 0.0733 | 19.0    |
| 2.7093        | 56.0  | 840  | 1.9695          | 0.0733 | 19.0    |
| 2.7093        | 57.0  | 855  | 1.9666          | 0.0733 | 19.0    |
| 2.7093        | 58.0  | 870  | 1.9643          | 0.0733 | 19.0    |
| 2.7093        | 59.0  | 885  | 1.9617          | 0.0627 | 19.0    |
| 2.7093        | 60.0  | 900  | 1.9590          | 0.0732 | 19.0    |
| 2.7093        | 61.0  | 915  | 1.9561          | 0.0626 | 19.0    |
| 2.7093        | 62.0  | 930  | 1.9532          | 0.0626 | 19.0    |
| 2.7093        | 63.0  | 945  | 1.9509          | 0.0626 | 19.0    |
| 2.7093        | 64.0  | 960  | 1.9487          | 0.0626 | 19.0    |
| 2.7093        | 65.0  | 975  | 1.9473          | 0.0608 | 19.0    |
| 2.7093        | 66.0  | 990  | 1.9454          | 0.0608 | 19.0    |
| 2.1497        | 67.0  | 1005 | 1.9430          | 0.0613 | 19.0    |
| 2.1497        | 68.0  | 1020 | 1.9407          | 0.0613 | 19.0    |
| 2.1497        | 69.0  | 1035 | 1.9389          | 0.0613 | 19.0    |
| 2.1497        | 70.0  | 1050 | 1.9371          | 0.0613 | 19.0    |
| 2.1497        | 71.0  | 1065 | 1.9356          | 0.0613 | 19.0    |
| 2.1497        | 72.0  | 1080 | 1.9341          | 0.0613 | 19.0    |
| 2.1497        | 73.0  | 1095 | 1.9320          | 0.0613 | 19.0    |
| 2.1497        | 74.0  | 1110 | 1.9304          | 0.0681 | 19.0    |
| 2.1497        | 75.0  | 1125 | 1.9290          | 0.0681 | 19.0    |
| 2.1497        | 76.0  | 1140 | 1.9276          | 0.0681 | 19.0    |
| 2.1497        | 77.0  | 1155 | 1.9260          | 0.0681 | 19.0    |
| 2.1497        | 78.0  | 1170 | 1.9248          | 0.0681 | 19.0    |
| 2.1497        | 79.0  | 1185 | 1.9235          | 0.0681 | 19.0    |
| 2.1497        | 80.0  | 1200 | 1.9223          | 0.0681 | 19.0    |
| 2.1497        | 81.0  | 1215 | 1.9213          | 0.0681 | 19.0    |
| 2.1497        | 82.0  | 1230 | 1.9204          | 0.0681 | 19.0    |
| 2.1497        | 83.0  | 1245 | 1.9197          | 0.0681 | 19.0    |
| 2.1497        | 84.0  | 1260 | 1.9190          | 0.0681 | 19.0    |
| 2.1497        | 85.0  | 1275 | 1.9181          | 0.069  | 19.0    |
| 2.1497        | 86.0  | 1290 | 1.9175          | 0.069  | 19.0    |
| 2.1497        | 87.0  | 1305 | 1.9167          | 0.069  | 19.0    |
| 2.1497        | 88.0  | 1320 | 1.9159          | 0.069  | 19.0    |
| 2.1497        | 89.0  | 1335 | 1.9152          | 0.069  | 19.0    |
| 2.1497        | 90.0  | 1350 | 1.9145          | 0.069  | 19.0    |
| 2.1497        | 91.0  | 1365 | 1.9140          | 0.069  | 19.0    |
| 2.1497        | 92.0  | 1380 | 1.9136          | 0.069  | 19.0    |
| 2.1497        | 93.0  | 1395 | 1.9132          | 0.0681 | 19.0    |
| 2.1497        | 94.0  | 1410 | 1.9130          | 0.0681 | 19.0    |
| 2.1497        | 95.0  | 1425 | 1.9127          | 0.0681 | 19.0    |
| 2.1497        | 96.0  | 1440 | 1.9125          | 0.0681 | 19.0    |
| 2.1497        | 97.0  | 1455 | 1.9123          | 0.069  | 19.0    |
| 2.1497        | 98.0  | 1470 | 1.9122          | 0.069  | 19.0    |
| 2.1497        | 99.0  | 1485 | 1.9121          | 0.0681 | 19.0    |
| 2.0375        | 100.0 | 1500 | 1.9121          | 0.0681 | 19.0    |
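
For reference, a sketch of the kind of `compute_metrics` function that produces the Bleu and Gen Len columns above, in the style of the standard translation fine-tuning recipe; the actual implementation used for this card is not shown, and the tokenizer here is a stand-in:

```python
# Metric sketch; the compute_metrics actually used for this card is unknown.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # stand-in tokenizer
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # Gen Len: mean count of non-padding tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```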

### Framework versions

- Transformers 4.38.1
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.2