6000

This model is a fine-tuned version of Helsinki-NLP/opus-mt-es-es, trained jointly on the English-ASL glosses dataset and the Spanish-MSL glosses dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.2172
  • Model Preparation Time: 0.0055
  • BLEU MSL: 90.1233
  • BLEU ASL: 0
  • TER MSL: 6.3523
  • TER ASL: 100
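
Since the base checkpoint is a Marian MT model, the fine-tuned weights should load through the standard transformers seq2seq classes. A minimal inference sketch, assuming the checkpoint is published on the Hub as vania2911/6000 and that the model maps written source text to sign-language glosses (the example sentence and the expected input side are assumptions, not stated in this card):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "vania2911/6000"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical Spanish input; the expected source language is an assumption.
text = "la niña lee un libro"
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Under this assumption, the decoded string is the predicted gloss sequence for the input sentence.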

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
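
For reference, the list above maps onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a reconstruction, not the original training script; output_dir, eval_strategy, and predict_with_generate are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-glosses",   # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",            # AdamW, as listed above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                      # Native AMP mixed precision
    eval_strategy="epoch",          # assumption: metrics below are reported per epoch
    predict_with_generate=True,     # assumption: needed to compute BLEU/TER during eval
)
```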

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | BLEU MSL | BLEU ASL | TER MSL | TER ASL |
|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:--------:|:--------:|:-------:|:-------:|
| No log | 1.0 | 150 | 2.1707 | 0.0055 | 6.2392 | 23.0477 | 102.9488 | 79.8599 |
| No log | 2.0 | 300 | 1.5545 | 0.0055 | 12.5922 | 39.9535 | 98.5256 | 65.6743 |
| No log | 3.0 | 450 | 1.1397 | 0.0055 | 53.7526 | 56.8191 | 35.1258 | 34.5009 |
| 2.0124 | 4.0 | 600 | 0.8560 | 0.0055 | 61.3665 | 58.7258 | 26.5395 | 33.1874 |
| 2.0124 | 5.0 | 750 | 0.6526 | 0.0055 | 66.5262 | 60.0265 | 23.0703 | 32.7496 |
| 2.0124 | 6.0 | 900 | 0.5405 | 0.0055 | 42.9331 | 69.2419 | 32.0035 | 20.7531 |
| 0.7028 | 7.0 | 1050 | 0.4769 | 0.0055 | 62.3002 | 73.9626 | 22.4631 | 16.9002 |
| 0.7028 | 8.0 | 1200 | 0.4427 | 0.0055 | 72.1107 | 84.5297 | 16.9991 | 8.0560 |
| 0.7028 | 9.0 | 1350 | 0.4153 | 0.0055 | 74.8931 | 83.3473 | 16.1318 | 9.3695 |
| 0.3613 | 10.0 | 1500 | 0.3961 | 0.0055 | 74.8480 | 85.2817 | 15.0911 | 8.4063 |
| 0.3613 | 11.0 | 1650 | 0.3794 | 0.0055 | 75.4490 | 84.4129 | 15.1778 | 8.4939 |
| 0.3613 | 12.0 | 1800 | 0.3563 | 0.0055 | 76.8164 | 86.0871 | 13.7901 | 7.6182 |
| 0.3613 | 13.0 | 1950 | 0.3277 | 0.0055 | 78.1801 | 85.5103 | 13.5299 | 7.9685 |
| 0.2432 | 14.0 | 2100 | 0.3070 | 0.0055 | 78.6695 | 86.9448 | 13.2697 | 7.5306 |
| 0.2432 | 15.0 | 2250 | 0.2972 | 0.0055 | 77.6165 | 86.1077 | 13.5299 | 7.7058 |
| 0.2432 | 16.0 | 2400 | 0.2917 | 0.0055 | 78.0856 | 86.5549 | 13.0095 | 7.5306 |
| 0.1592 | 17.0 | 2550 | 0.2855 | 0.0055 | 77.3656 | 86.8329 | 13.2697 | 7.2680 |
| 0.1592 | 18.0 | 2700 | 0.2804 | 0.0055 | 78.3336 | 54.3313 | 12.9228 | 50.6130 |
| 0.1592 | 19.0 | 2850 | 0.2791 | 0.0055 | 77.4559 | 87.1498 | 13.2697 | 7.2680 |
| 0.1188 | 20.0 | 3000 | 0.2749 | 0.0055 | 78.0130 | 54.3314 | 12.9228 | 51.0508 |
| 0.1188 | 21.0 | 3150 | 0.2717 | 0.0055 | 78.2950 | 53.8351 | 12.5759 | 51.3135 |
| 0.1188 | 22.0 | 3300 | 0.2710 | 0.0055 | 78.2663 | 54.3637 | 12.6626 | 50.8757 |
| 0.1188 | 23.0 | 3450 | 0.2686 | 0.0055 | 77.8997 | 54.1712 | 13.0095 | 50.8757 |
| 0.0992 | 24.0 | 3600 | 0.2669 | 0.0055 | 79.2314 | 54.3758 | 12.2290 | 51.0508 |
| 0.0992 | 25.0 | 3750 | 0.2656 | 0.0055 | 78.2862 | 54.2965 | 12.5759 | 51.1384 |
| 0.0992 | 26.0 | 3900 | 0.2643 | 0.0055 | 78.9745 | 54.2624 | 12.4024 | 50.8757 |
| 0.0889 | 27.0 | 4050 | 0.2651 | 0.0055 | 79.3330 | 54.3778 | 12.1422 | 50.9632 |
| 0.0889 | 28.0 | 4200 | 0.2643 | 0.0055 | 78.4405 | 54.3716 | 12.3157 | 50.9632 |
| 0.0889 | 29.0 | 4350 | 0.2639 | 0.0055 | 78.5866 | 54.3716 | 12.3157 | 50.9632 |
| 0.083 | 30.0 | 4500 | 0.2639 | 0.0055 | 78.5866 | 54.3716 | 12.3157 | 50.9632 |
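
The BLEU and TER columns track the two gloss targets separately. A minimal sketch of how such scores can be computed with the Hugging Face evaluate library (an assumption about tooling; the card does not state the exact evaluation script):

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")
ter = evaluate.load("ter")

# Hypothetical predictions and references for one of the gloss targets.
predictions = ["NIÑA LIBRO LEER"]
references = [["NIÑA LIBRO LEER"]]

bleu_score = sacrebleu.compute(predictions=predictions, references=references)["score"]
ter_score = ter.compute(predictions=predictions, references=references)["score"]
print(f"BLEU: {bleu_score:.4f}, TER: {ter_score:.4f}")
```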

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0