# esp-to-lsm-model
This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es on a dataset of Spanish sentences paired with Mexican Sign Language (LSM) glosses. It achieves the following results on the evaluation set:
- Loss: 0.5224
- Bleu: 74.2913
- Rouge-1: 0.9064
- Rouge-2: 0.8341
- Rouge-L: 0.9019
- Rouge-Lsum: 0.9021
- Ter Score: 14.6840
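The Ter Score above is the translation edit rate: the number of word-level edits needed to turn the hypothesis into the reference, as a percentage of the reference length (lower is better). A minimal pure-Python sketch of the idea, simplified to insertions, deletions, and substitutions only (full TER also counts block shifts, so this is an illustration, not the sacrebleu implementation):

```python
def edit_distance(hyp, ref):
    # Single-row Levenshtein distance over word tokens.
    d = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        prev, d[0] = d[0], i
        for j, r in enumerate(ref, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (h != r))  # substitution (free if equal)
    return d[-1]

def ter(hypothesis, reference):
    # Edits per reference word, scaled to a percentage.
    hyp, ref = hypothesis.split(), reference.split()
    return 100.0 * edit_distance(hyp, ref) / len(ref)
```

On this reading, the final Ter Score of 14.68 corresponds to roughly one edit per seven reference gloss tokens.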
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1.5e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
- mixed_precision_training: Native AMP
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge-1 | Rouge-2 | Rouge-L | Rouge-Lsum | Ter Score |
|---|---|---|---|---|---|---|---|---|---|
| 2.5487 | 1.0 | 75 | 1.8275 | 33.3311 | 0.7126 | 0.5131 | 0.6740 | 0.6731 | 48.9777 |
| 1.4170 | 2.0 | 150 | 1.2236 | 58.3622 | 0.8070 | 0.6697 | 0.7904 | 0.7895 | 29.4610 |
| 0.9666 | 3.0 | 225 | 0.9751 | 68.5295 | 0.8502 | 0.7351 | 0.8411 | 0.8411 | 21.4684 |
| 0.8217 | 4.0 | 300 | 0.8450 | 44.5871 | 0.8679 | 0.7698 | 0.8597 | 0.8601 | 30.2974 |
| 0.7691 | 5.0 | 375 | 0.7586 | 45.8903 | 0.8778 | 0.7897 | 0.8715 | 0.8711 | 28.8104 |
| 0.5557 | 6.0 | 450 | 0.6913 | 60.0358 | 0.8811 | 0.8024 | 0.8776 | 0.8773 | 21.2825 |
| 0.5462 | 7.0 | 525 | 0.6471 | 59.0748 | 0.8827 | 0.8028 | 0.8788 | 0.8785 | 21.8401 |
| 0.4446 | 8.0 | 600 | 0.6160 | 40.9211 | 0.8940 | 0.8149 | 0.8906 | 0.8905 | 30.8550 |
| 0.3959 | 9.0 | 675 | 0.5945 | 42.2774 | 0.8942 | 0.8151 | 0.8909 | 0.8916 | 30.1115 |
| 0.3249 | 10.0 | 750 | 0.5759 | 70.2959 | 0.9013 | 0.8230 | 0.8965 | 0.8970 | 16.7286 |
| 0.3459 | 11.0 | 825 | 0.5514 | 43.2915 | 0.9023 | 0.8307 | 0.8988 | 0.8987 | 28.9033 |
| 0.3153 | 12.0 | 900 | 0.5405 | 44.9816 | 0.9048 | 0.8334 | 0.9006 | 0.9010 | 27.5093 |
| 0.2851 | 13.0 | 975 | 0.5381 | 72.0806 | 0.9057 | 0.8312 | 0.9011 | 0.9014 | 15.7063 |
| 0.2526 | 14.0 | 1050 | 0.5349 | 75.0117 | 0.9029 | 0.8248 | 0.8983 | 0.8983 | 14.9628 |
| 0.2209 | 15.0 | 1125 | 0.5281 | 74.3845 | 0.9036 | 0.8278 | 0.8997 | 0.9000 | 14.7770 |
| 0.2668 | 16.0 | 1200 | 0.5265 | 74.2756 | 0.9031 | 0.8252 | 0.8980 | 0.8986 | 14.8699 |
| 0.2314 | 17.0 | 1275 | 0.5258 | 74.5417 | 0.9059 | 0.8316 | 0.9014 | 0.9015 | 14.5911 |
| 0.2069 | 18.0 | 1350 | 0.5225 | 74.5623 | 0.9067 | 0.8357 | 0.9022 | 0.9028 | 14.6840 |
| 0.1870 | 19.0 | 1425 | 0.5225 | 74.2989 | 0.9060 | 0.8327 | 0.9016 | 0.9017 | 14.7770 |
| 0.2413 | 20.0 | 1500 | 0.5224 | 74.2913 | 0.9064 | 0.8341 | 0.9019 | 0.9021 | 14.6840 |
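The step counts in the table also pin down the approximate training set size: 1,500 steps over 20 epochs is 75 optimizer steps per epoch, which at a train batch size of 32 implies at most 75 × 32 = 2,400 sentence pairs (an upper bound, since the final batch of each epoch may be partial):

```python
total_steps, epochs, batch_size = 1500, 20, 32
steps_per_epoch = total_steps // epochs             # 75 optimizer steps per epoch
max_train_examples = steps_per_epoch * batch_size   # at most 2,400 sentence pairs
```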
### Framework versions
- Transformers 4.26.1
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.13.3