# esp-to-lsm-model
This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es on a dataset of Spanish sentences paired with Mexican Sign Language (LSM) glosses. It achieves the following results on the evaluation set:
- Loss: 0.5224
- Bleu: 74.2913
- Rouge-1: 0.9064
- Rouge-2: 0.8341
- Rouge-L: 0.9019
- Rouge-Lsum: 0.9021
- Ter Score: 14.6840
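The Ter Score above is the translation edit rate: the number of word-level edits needed to turn the hypothesis into the reference, as a percentage of the reference length (lower is better). A minimal pure-Python sketch of the idea, simplified to insertions, deletions, and substitutions only (full TER also counts block shifts, so this is an illustration, not the sacrebleu implementation):

```python
def edit_distance(hyp, ref):
    # Single-row Levenshtein distance over word tokens.
    d = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        prev, d[0] = d[0], i
        for j, r in enumerate(ref, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (h != r))  # substitution (free if equal)
    return d[-1]

def ter(hypothesis, reference):
    # Edits per reference word, scaled to a percentage.
    hyp, ref = hypothesis.split(), reference.split()
    return 100.0 * edit_distance(hyp, ref) / len(ref)
```

On this reading, the final Ter Score of 14.68 corresponds to roughly one edit per seven reference gloss tokens.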
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1.5e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
- mixed_precision_training: Native AMP
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge-1 | Rouge-2 | Rouge-L | Rouge-Lsum | Ter Score |
|---|---|---|---|---|---|---|---|---|---|
| 2.5487 | 1.0 | 75 | 1.8275 | 33.3311 | 0.7126 | 0.5131 | 0.6740 | 0.6731 | 48.9777 |
| 1.4170 | 2.0 | 150 | 1.2236 | 58.3622 | 0.8070 | 0.6697 | 0.7904 | 0.7895 | 29.4610 |
| 0.9666 | 3.0 | 225 | 0.9751 | 68.5295 | 0.8502 | 0.7351 | 0.8411 | 0.8411 | 21.4684 |
| 0.8217 | 4.0 | 300 | 0.8450 | 44.5871 | 0.8679 | 0.7698 | 0.8597 | 0.8601 | 30.2974 |
| 0.7691 | 5.0 | 375 | 0.7586 | 45.8903 | 0.8778 | 0.7897 | 0.8715 | 0.8711 | 28.8104 |
| 0.5557 | 6.0 | 450 | 0.6913 | 60.0358 | 0.8811 | 0.8024 | 0.8776 | 0.8773 | 21.2825 |
| 0.5462 | 7.0 | 525 | 0.6471 | 59.0748 | 0.8827 | 0.8028 | 0.8788 | 0.8785 | 21.8401 |
| 0.4446 | 8.0 | 600 | 0.6160 | 40.9211 | 0.8940 | 0.8149 | 0.8906 | 0.8905 | 30.8550 |
| 0.3959 | 9.0 | 675 | 0.5945 | 42.2774 | 0.8942 | 0.8151 | 0.8909 | 0.8916 | 30.1115 |
| 0.3249 | 10.0 | 750 | 0.5759 | 70.2959 | 0.9013 | 0.8230 | 0.8965 | 0.8970 | 16.7286 |
| 0.3459 | 11.0 | 825 | 0.5514 | 43.2915 | 0.9023 | 0.8307 | 0.8988 | 0.8987 | 28.9033 |
| 0.3153 | 12.0 | 900 | 0.5405 | 44.9816 | 0.9048 | 0.8334 | 0.9006 | 0.9010 | 27.5093 |
| 0.2851 | 13.0 | 975 | 0.5381 | 72.0806 | 0.9057 | 0.8312 | 0.9011 | 0.9014 | 15.7063 |
| 0.2526 | 14.0 | 1050 | 0.5349 | 75.0117 | 0.9029 | 0.8248 | 0.8983 | 0.8983 | 14.9628 |
| 0.2209 | 15.0 | 1125 | 0.5281 | 74.3845 | 0.9036 | 0.8278 | 0.8997 | 0.9000 | 14.7770 |
| 0.2668 | 16.0 | 1200 | 0.5265 | 74.2756 | 0.9031 | 0.8252 | 0.8980 | 0.8986 | 14.8699 |
| 0.2314 | 17.0 | 1275 | 0.5258 | 74.5417 | 0.9059 | 0.8316 | 0.9014 | 0.9015 | 14.5911 |
| 0.2069 | 18.0 | 1350 | 0.5225 | 74.5623 | 0.9067 | 0.8357 | 0.9022 | 0.9028 | 14.6840 |
| 0.1870 | 19.0 | 1425 | 0.5225 | 74.2989 | 0.9060 | 0.8327 | 0.9016 | 0.9017 | 14.7770 |
| 0.2413 | 20.0 | 1500 | 0.5224 | 74.2913 | 0.9064 | 0.8341 | 0.9019 | 0.9021 | 14.6840 |
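The step counts in the table also pin down the approximate training set size: 1,500 steps over 20 epochs is 75 optimizer steps per epoch, which at a train batch size of 32 implies at most 75 × 32 = 2,400 sentence pairs (an upper bound, since the final batch of each epoch may be partial):

```python
total_steps, epochs, batch_size = 1500, 20, 32
steps_per_epoch = total_steps // epochs             # 75 optimizer steps per epoch
max_train_examples = steps_per_epoch * batch_size   # at most 2,400 sentence pairs
```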
### Framework versions
- Transformers 4.26.1
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.13.3