esp-to-lsm-model

This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es on a dataset of Spanish sentences paired with LSM (Mexican Sign Language) glosses. It achieves the following results on the evaluation set:

  • Loss: 0.5224
  • Bleu: 74.2913
  • Rouge1: 0.9064
  • Rouge2: 0.8341
  • RougeL: 0.9019
  • RougeLsum: 0.9021
  • Ter Score: 14.6840

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.5e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
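These hyperparameters can be sanity-checked against the step counts logged in the results table: 75 optimizer steps per epoch at batch size 32 implies a training set of roughly 2,400 sentence pairs (an estimate, since the last batch may be partial), and 20 epochs give the 1,500 total steps shown there.

```python
steps_per_epoch = 75    # from the training results table
train_batch_size = 32   # from the hyperparameters above
num_epochs = 20

# Approximate dataset size implied by the logged steps.
approx_train_examples = steps_per_epoch * train_batch_size
total_steps = steps_per_epoch * num_epochs
print(approx_train_examples, total_steps)  # 2400 1500
```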

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Rouge1 | Rouge2 | RougeL | RougeLsum | Ter Score |
|---------------|-------|------|-----------------|---------|--------|--------|--------|-----------|-----------|
| 2.5487        | 1.0   | 75   | 1.8275          | 33.3311 | 0.7126 | 0.5131 | 0.6740 | 0.6731    | 48.9777   |
| 1.4170        | 2.0   | 150  | 1.2236          | 58.3622 | 0.8070 | 0.6697 | 0.7904 | 0.7895    | 29.4610   |
| 0.9666        | 3.0   | 225  | 0.9751          | 68.5295 | 0.8502 | 0.7351 | 0.8411 | 0.8411    | 21.4684   |
| 0.8217        | 4.0   | 300  | 0.8450          | 44.5871 | 0.8679 | 0.7698 | 0.8597 | 0.8601    | 30.2974   |
| 0.7691        | 5.0   | 375  | 0.7586          | 45.8903 | 0.8778 | 0.7897 | 0.8715 | 0.8711    | 28.8104   |
| 0.5557        | 6.0   | 450  | 0.6913          | 60.0358 | 0.8811 | 0.8024 | 0.8776 | 0.8773    | 21.2825   |
| 0.5462        | 7.0   | 525  | 0.6471          | 59.0748 | 0.8827 | 0.8028 | 0.8788 | 0.8785    | 21.8401   |
| 0.4446        | 8.0   | 600  | 0.6160          | 40.9211 | 0.8940 | 0.8149 | 0.8906 | 0.8905    | 30.8550   |
| 0.3959        | 9.0   | 675  | 0.5945          | 42.2774 | 0.8942 | 0.8151 | 0.8909 | 0.8916    | 30.1115   |
| 0.3249        | 10.0  | 750  | 0.5759          | 70.2959 | 0.9013 | 0.8230 | 0.8965 | 0.8970    | 16.7286   |
| 0.3459        | 11.0  | 825  | 0.5514          | 43.2915 | 0.9023 | 0.8307 | 0.8988 | 0.8987    | 28.9033   |
| 0.3153        | 12.0  | 900  | 0.5405          | 44.9816 | 0.9048 | 0.8334 | 0.9006 | 0.9010    | 27.5093   |
| 0.2851        | 13.0  | 975  | 0.5381          | 72.0806 | 0.9057 | 0.8312 | 0.9011 | 0.9014    | 15.7063   |
| 0.2526        | 14.0  | 1050 | 0.5349          | 75.0117 | 0.9029 | 0.8248 | 0.8983 | 0.8983    | 14.9628   |
| 0.2209        | 15.0  | 1125 | 0.5281          | 74.3845 | 0.9036 | 0.8278 | 0.8997 | 0.9000    | 14.7770   |
| 0.2668        | 16.0  | 1200 | 0.5265          | 74.2756 | 0.9031 | 0.8252 | 0.8980 | 0.8986    | 14.8699   |
| 0.2314        | 17.0  | 1275 | 0.5258          | 74.5417 | 0.9059 | 0.8316 | 0.9014 | 0.9015    | 14.5911   |
| 0.2069        | 18.0  | 1350 | 0.5225          | 74.5623 | 0.9067 | 0.8357 | 0.9022 | 0.9028    | 14.6840   |
| 0.1870        | 19.0  | 1425 | 0.5225          | 74.2989 | 0.9060 | 0.8327 | 0.9016 | 0.9017    | 14.7770   |
| 0.2413        | 20.0  | 1500 | 0.5224          | 74.2913 | 0.9064 | 0.8341 | 0.9019 | 0.9021    | 14.6840   |
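The linear scheduler listed in the hyperparameters decays the learning rate from its peak to zero over the 1,500 total steps. A minimal sketch of that schedule, assuming zero warmup steps (the card does not state a warmup value):

```python
PEAK_LR = 1.5e-05   # learning_rate from the hyperparameters
TOTAL_STEPS = 1500  # 20 epochs x 75 steps per epoch

def linear_lr(step: int, peak: float = PEAK_LR,
              total: int = TOTAL_STEPS, warmup: int = 0) -> float:
    """Linear warmup (zero here, by assumption) followed by linear
    decay from the peak learning rate down to 0 at the final step."""
    if step < warmup:
        return peak * step / max(warmup, 1)
    return peak * max(0.0, (total - step) / max(total - warmup, 1))

print(linear_lr(0), linear_lr(750), linear_lr(1500))
```

At the midpoint (step 750, the end of epoch 10), the learning rate is half the peak; by step 1,500 it has decayed to zero.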

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.13.3