
mT5-TextSimp-LT-BatchSize2-lr1e-4

This model is a fine-tuned version of google/mt5-base; the training dataset is not documented in this card (the model name suggests Lithuanian text simplification). It achieves the following results on the evaluation set:

  • Loss: 0.0672
  • ROUGE-1: 0.7548
  • ROUGE-2: 0.5989
  • ROUGE-L: 0.7509
  • SacreBLEU: 49.0373
  • Gen Len: 38.0501

Model description

More information needed

Intended uses & limitations

More information needed
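
The card does not document usage, but the model loads as a standard seq2seq checkpoint. Below is a minimal inference sketch, assuming the repository id from the model tree at the end of this card and that inputs are Lithuanian sentences to be simplified (the input format is an assumption, since the training data is not described):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "eglkan1/mT5-TextSimp-LT-BatchSize2-lr1e-4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input: a Lithuanian sentence to simplify.
text = "Vilnius yra Lietuvos Respublikos sostinė ir didžiausias šalies miestas."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# The evaluation Gen Len is ~38 tokens, so a small generation budget suffices.
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```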

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
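
A minimal sketch of an equivalent setup with the transformers Seq2SeqTrainer follows; the toy corpus, column names, and preprocessing are assumptions, since the actual training data is not documented (the optimizer and scheduler settings above are the Trainer defaults plus the values listed):

```python
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

# Hypothetical toy corpus; the real source/target pairs are not documented.
raw = Dataset.from_dict({
    "source": ["Sudėtingas originalus sakinys."],
    "target": ["Paprastas sakinys."],
})

def preprocess(batch):
    model_inputs = tokenizer(batch["source"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["target"], truncation=True, max_length=512)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

ds = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="mT5-TextSimp-LT-BatchSize2-lr1e-4",
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=8,
    evaluation_strategy="steps",  # the results table reports eval every 200 steps
    eval_steps=200,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=ds,
    eval_dataset=ds,  # placeholder; a held-out split would be used in practice
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```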

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | SacreBLEU | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 25.6783       | 0.24  | 200  | 16.0497         | 0.0109  | 0.0005  | 0.0107  | 0.0029    | 512.0   |
| 1.9593        | 0.48  | 400  | 0.7780          | 0.014   | 0.0005  | 0.0136  | 0.0146    | 42.685  |
| 0.2778        | 0.72  | 600  | 0.1429          | 0.4924  | 0.3128  | 0.4803  | 20.3057   | 38.0382 |
| 0.1325        | 0.96  | 800  | 0.1039          | 0.6193  | 0.4369  | 0.6098  | 33.687    | 38.0501 |
| 0.1702        | 1.2   | 1000 | 0.0958          | 0.6697  | 0.5016  | 0.6613  | 38.0391   | 38.0501 |
| 0.13          | 1.44  | 1200 | 0.0880          | 0.6737  | 0.5051  | 0.6644  | 38.62     | 38.0501 |
| 0.1086        | 1.67  | 1400 | 0.0839          | 0.6964  | 0.5326  | 0.6884  | 40.9056   | 38.0501 |
| 0.0716        | 1.91  | 1600 | 0.0859          | 0.6933  | 0.5298  | 0.6862  | 40.7158   | 38.0501 |
| 0.1135        | 2.15  | 1800 | 0.0820          | 0.7017  | 0.5366  | 0.6936  | 40.7484   | 38.0501 |
| 0.0997        | 2.39  | 2000 | 0.0814          | 0.7011  | 0.5351  | 0.6945  | 41.1948   | 38.0501 |
| 0.0996        | 2.63  | 2200 | 0.0774          | 0.7103  | 0.5522  | 0.7049  | 42.5756   | 38.0501 |
| 1.1379        | 2.87  | 2400 | 0.0763          | 0.7211  | 0.5556  | 0.7152  | 43.2411   | 38.0501 |
| 0.0594        | 3.11  | 2600 | 0.0776          | 0.7261  | 0.5647  | 0.7201  | 44.2205   | 38.0501 |
| 0.0763        | 3.35  | 2800 | 0.0736          | 0.7309  | 0.5709  | 0.7251  | 45.2825   | 38.0501 |
| 0.1641        | 3.59  | 3000 | 0.0722          | 0.7297  | 0.5685  | 0.7242  | 44.9001   | 38.0501 |
| 0.1085        | 3.83  | 3200 | 0.0703          | 0.7377  | 0.5793  | 0.7319  | 45.7504   | 38.0501 |
| 0.0573        | 4.07  | 3400 | 0.0719          | 0.7393  | 0.5796  | 0.7335  | 45.86     | 38.0501 |
| 0.1149        | 4.31  | 3600 | 0.0705          | 0.7415  | 0.5787  | 0.7365  | 46.2652   | 38.0501 |
| 0.0843        | 4.55  | 3800 | 0.0703          | 0.7385  | 0.5754  | 0.7326  | 46.5292   | 38.0501 |
| 0.0658        | 4.78  | 4000 | 0.0705          | 0.7437  | 0.5855  | 0.7384  | 46.864    | 38.0501 |
| 0.0676        | 5.02  | 4200 | 0.0694          | 0.7437  | 0.584   | 0.7384  | 47.1268   | 38.0501 |
| 0.0657        | 5.26  | 4400 | 0.0711          | 0.7473  | 0.5913  | 0.7432  | 47.4413   | 38.0501 |
| 0.0679        | 5.5   | 4600 | 0.0702          | 0.7496  | 0.5908  | 0.7446  | 47.8281   | 38.0501 |
| 0.0664        | 5.74  | 4800 | 0.0671          | 0.7511  | 0.5929  | 0.7463  | 47.7693   | 38.0501 |
| 0.0446        | 5.98  | 5000 | 0.0685          | 0.7533  | 0.5932  | 0.7478  | 48.032    | 38.0501 |
| 0.0732        | 6.22  | 5200 | 0.0678          | 0.7523  | 0.5948  | 0.7472  | 48.3467   | 38.0501 |
| 0.0706        | 6.46  | 5400 | 0.0672          | 0.755   | 0.5983  | 0.7507  | 48.6158   | 38.0501 |
| 0.051         | 6.7   | 5600 | 0.0674          | 0.7523  | 0.5961  | 0.7478  | 48.4828   | 38.0501 |
| 0.067         | 6.94  | 5800 | 0.0681          | 0.7532  | 0.5978  | 0.7492  | 48.7253   | 38.0501 |
| 0.075         | 7.18  | 6000 | 0.0684          | 0.7534  | 0.5969  | 0.7492  | 48.7053   | 38.0501 |
| 0.1323        | 7.42  | 6200 | 0.0671          | 0.755   | 0.5991  | 0.7511  | 48.9922   | 38.0501 |
| 0.0383        | 7.66  | 6400 | 0.0671          | 0.7551  | 0.5994  | 0.7511  | 49.0028   | 38.0501 |
| 0.0599        | 7.89  | 6600 | 0.0672          | 0.7548  | 0.5989  | 0.7509  | 49.0373   | 38.0501 |
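
The ROUGE (0–1) and SacreBLEU (0–100) scales above match the defaults of the evaluate library; below is a minimal sketch of how such scores could be computed (the exact metric configuration used for this card is an assumption):

```python
import evaluate

rouge = evaluate.load("rouge")
sacrebleu = evaluate.load("sacrebleu")

# Hypothetical decoded model outputs and gold simplifications.
predictions = ["Paprastas sakinys."]
references = ["Paprastas sakinys."]

print(rouge.compute(predictions=predictions, references=references))
print(sacrebleu.compute(predictions=predictions,
                        references=[[r] for r in references]))
```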

Framework versions

  • Transformers 4.33.0
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.4
  • Tokenizers 0.13.3

Model tree for eglkan1/mT5-TextSimp-LT-BatchSize2-lr1e-4

  • Base model: google/mt5-base