metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - bleu
  - rouge
model-index:
  - name: esp-to-lsm-model-split
    results: []

esp-to-lsm-model-split

This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es on an unknown dataset. After the final training epoch (epoch 20, step 1500), it achieves the following results on the evaluation set:

  • Loss: 0.5690
  • Bleu: 83.5807
  • Rouge: {'rouge1': 0.9265753592812418, 'rouge2': 0.8656694324194325, 'rougeL': 0.9238164847135437, 'rougeLsum': 0.9238003663003664}
  • Ter Score: 10.0090

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.00015
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
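As a rough illustration of the `linear` scheduler setting, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup value is listed above) and 1500 total steps (20 epochs at 75 steps per epoch, as reported in the training results). The function name `linear_lr` is hypothetical and not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 1.5e-4,
              total_steps: int = 1500,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not list a warmup value.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    steps_per_epoch = 75  # taken from the Step column of the training results
    for epoch in (0, 5, 10, 15, 20):
        step = epoch * steps_per_epoch
        print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 1.5e-4, is halved by epoch 10, and reaches zero at step 1500.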

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Rouge | Ter Score |
|---|---|---|---|---|---|---|
| 0.997 | 1.0 | 75 | 0.7578 | 74.2121 | {'rouge1': 0.8930136077372922, 'rouge2': 0.8132252290193469, 'rougeL': 0.8868313923778324, 'rougeLsum': 0.8866414102466736} | 16.3210 |
| 0.4353 | 2.0 | 150 | 0.5659 | 50.7443 | {'rouge1': 0.9142509364274071, 'rouge2': 0.83197113997114, 'rougeL': 0.9055773276287983, 'rougeLsum': 0.9062817797670736} | 12.8043 |
| 0.2602 | 3.0 | 225 | 0.5444 | 72.0122 | {'rouge1': 0.9183889862860454, 'rouge2': 0.8433486969005839, 'rougeL': 0.9132635343958876, 'rougeLsum': 0.913651539908893} | 15.9603 |
| 0.2316 | 4.0 | 300 | 0.5503 | 50.9502 | {'rouge1': 0.9147289323852568, 'rouge2': 0.8403040453698347, 'rougeL': 0.9084138578656601, 'rougeLsum': 0.9084760810455303} | 13.0748 |
| 0.1203 | 5.0 | 375 | 0.5211 | 58.7666 | {'rouge1': 0.9278827629661555, 'rouge2': 0.8655444837508406, 'rougeL': 0.922415336132431, 'rougeLsum': 0.9224576705147474} | 29.6664 |
| 0.1216 | 6.0 | 450 | 0.5491 | 81.6262 | {'rouge1': 0.9206053007450066, 'rouge2': 0.8534470899470898, 'rougeL': 0.9171148252618841, 'rougeLsum': 0.9168772093919156} | 11.0911 |
| 0.0754 | 7.0 | 525 | 0.5095 | 83.4616 | {'rouge1': 0.9305456776339132, 'rouge2': 0.8778395262145262, 'rougeL': 0.9280110015257075, 'rougeLsum': 0.9281936805025043} | 10.0090 |
| 0.0848 | 8.0 | 600 | 0.5538 | 81.8681 | {'rouge1': 0.9248025063172123, 'rouge2': 0.8648207579457581, 'rougeL': 0.9219360612154733, 'rougeLsum': 0.921904937654938} | 10.4599 |
| 0.0504 | 9.0 | 675 | 0.5390 | 80.8118 | {'rouge1': 0.9217618560633272, 'rouge2': 0.8611767121767122, 'rougeL': 0.9194047336106163, 'rougeLsum': 0.9196579346579348} | 12.3535 |
| 0.0367 | 10.0 | 750 | 0.5632 | 82.2896 | {'rouge1': 0.9241220549602904, 'rouge2': 0.8623059255559258, 'rougeL': 0.921636625901332, 'rougeLsum': 0.9214262796027506} | 10.8206 |
| 0.0386 | 11.0 | 825 | 0.5325 | 83.7819 | {'rouge1': 0.9264862667289138, 'rouge2': 0.8665701058201061, 'rougeL': 0.924734155278273, 'rougeLsum': 0.9247572857425799} | 10.2795 |
| 0.0377 | 12.0 | 900 | 0.5540 | 83.6969 | {'rouge1': 0.9270570480717542, 'rouge2': 0.8649807692307694, 'rougeL': 0.9248777127012422, 'rougeLsum': 0.9247459680842035} | 10.0090 |
| 0.0244 | 13.0 | 975 | 0.5462 | 83.4825 | {'rouge1': 0.9284353783471431, 'rouge2': 0.8673707311207314, 'rougeL': 0.9249773075508372, 'rougeLsum': 0.924672456084221} | 9.9188 |
| 0.0237 | 14.0 | 1050 | 0.5468 | 83.3820 | {'rouge1': 0.9267599383187618, 'rouge2': 0.8631084656084658, 'rougeL': 0.9244043657867187, 'rougeLsum': 0.9240160215601393} | 10.0992 |
| 0.0173 | 15.0 | 1125 | 0.5604 | 82.7936 | {'rouge1': 0.9260569985569987, 'rouge2': 0.8652394179894183, 'rougeL': 0.923313301078007, 'rougeLsum': 0.9233026695526696} | 10.1894 |
| 0.0193 | 16.0 | 1200 | 0.5689 | 85.1028 | {'rouge1': 0.9298936104744928, 'rouge2': 0.874325396825397, 'rougeL': 0.9280833015024192, 'rougeLsum': 0.9275536633845459} | 9.6483 |
| 0.0184 | 17.0 | 1275 | 0.5695 | 83.7781 | {'rouge1': 0.9266896553881849, 'rouge2': 0.8650757020757022, 'rougeL': 0.924688972247796, 'rougeLsum': 0.9245597692068284} | 10.2795 |
| 0.0142 | 18.0 | 1350 | 0.5655 | 83.6649 | {'rouge1': 0.925748337718926, 'rouge2': 0.8645625300625305, 'rougeL': 0.9233836055012529, 'rougeLsum': 0.9233253614577146} | 10.0090 |
| 0.0131 | 19.0 | 1425 | 0.5701 | 83.6843 | {'rouge1': 0.9268515199397553, 'rouge2': 0.8660478595478597, 'rougeL': 0.9242069248833956, 'rougeLsum': 0.9242629070276129} | 9.9188 |
| 0.0122 | 20.0 | 1500 | 0.5690 | 83.5807 | {'rouge1': 0.9265753592812418, 'rouge2': 0.8656694324194325, 'rougeL': 0.9238164847135437, 'rougeLsum': 0.9238003663003664} | 10.0090 |
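The headline results on this card match the final epoch, which is not the best epoch on every metric. The sketch below copies the validation loss, BLEU, and TER columns from the training results into a small list and picks the best epoch for each. The variable names and tuple layout are illustrative only, and whether an earlier checkpoint was saved or is preferable is not stated on this card.

```python
# (epoch, validation_loss, bleu, ter) copied from the training results above.
ROWS = [
    (1, 0.7578, 74.2121, 16.3210),
    (2, 0.5659, 50.7443, 12.8043),
    (3, 0.5444, 72.0122, 15.9603),
    (4, 0.5503, 50.9502, 13.0748),
    (5, 0.5211, 58.7666, 29.6664),
    (6, 0.5491, 81.6262, 11.0911),
    (7, 0.5095, 83.4616, 10.0090),
    (8, 0.5538, 81.8681, 10.4599),
    (9, 0.5390, 80.8118, 12.3535),
    (10, 0.5632, 82.2896, 10.8206),
    (11, 0.5325, 83.7819, 10.2795),
    (12, 0.5540, 83.6969, 10.0090),
    (13, 0.5462, 83.4825, 9.9188),
    (14, 0.5468, 83.3820, 10.0992),
    (15, 0.5604, 82.7936, 10.1894),
    (16, 0.5689, 85.1028, 9.6483),
    (17, 0.5695, 83.7781, 10.2795),
    (18, 0.5655, 83.6649, 10.0090),
    (19, 0.5701, 83.6843, 9.9188),
    (20, 0.5690, 83.5807, 10.0090),
]

# Lower is better for validation loss and TER; higher is better for BLEU.
lowest_loss = min(ROWS, key=lambda r: r[1])
best_bleu = max(ROWS, key=lambda r: r[2])
lowest_ter = min(ROWS, key=lambda r: r[3])

if __name__ == "__main__":
    print(f"lowest validation loss: epoch {lowest_loss[0]} ({lowest_loss[1]})")
    print(f"highest BLEU:           epoch {best_bleu[0]} ({best_bleu[2]})")
    print(f"lowest TER:             epoch {lowest_ter[0]} ({lowest_ter[3]})")
```

On these figures, validation loss bottoms out at epoch 7 while BLEU and TER are best at epoch 16, so the choice of checkpoint depends on which metric matters most for the intended use.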

Framework versions

  • Transformers 4.26.1
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.2
  • Tokenizers 0.13.3