metadata
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-en-fr
tags:
  - translation
  - generated_from_trainer
datasets:
  - kde4
metrics:
  - bleu
model-index:
  - name: finetuned-kde4-en-to-fr
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: kde4
          type: kde4
          config: en-fr
          split: train
          args: en-fr
        metrics:
          - name: Bleu
            type: bleu
            value: 52.88529894542656

Model description (finetuned-kde4-en-to-fr)

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-fr on the kde4 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8556
  • Bleu: 52.8853

Intended uses

  • Translation of English text to French
  • Translating technical, software-related text, such as the user-interface and documentation strings that make up the KDE4 corpus
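
A minimal usage sketch for the intended uses above. It assumes the checkpoint is published on the Hugging Face Hub under `Francesco-A/finetuned-kde4-en-to-fr` (a repository ID inferred from the model name, not stated in this card) and that the `transformers` package is installed. The helper name `translate_en_to_fr` is illustrative only.

```python
# Assumed Hub repository ID; adjust it if the checkpoint lives elsewhere.
MODEL_ID = "Francesco-A/finetuned-kde4-en-to-fr"


def translate_en_to_fr(texts, translator=None):
    """Translate a list of English strings to French.

    `translator` may be any callable with the translation-pipeline
    interface; when omitted, a pipeline is built from MODEL_ID.
    """
    if translator is None:
        # Imported lazily so the helper can be exercised without transformers.
        from transformers import pipeline

        translator = pipeline("translation", model=MODEL_ID)
    # The translation pipeline returns one {"translation_text": ...} dict per input.
    return [item["translation_text"] for item in translator(texts)]


# Example call (downloads the model on first use):
# translate_en_to_fr(["Default to expanded threads"])
```

Passing a custom `translator` is optional; it is there so that the post-processing can be checked without downloading the model.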

Limitations

  • The model's performance may degrade when translating sentences with complex or domain-specific terminology that was not present in the training data.
  • It may struggle with idiomatic expressions and cultural nuances that are not captured in the training data.

Training and evaluation data

The model was fine-tuned on the en-fr configuration of the KDE4 dataset, which pairs English sentences with their French translations. The data was split into 189,155 pairs for training and 21,018 pairs for validation.

Training procedure

The model was trained with the Seq2SeqTrainer API from the 🤗 Transformers library. The procedure consisted of tokenizing the English source sentences and the French target sentences, preparing a data collator for dynamic batching, and fine-tuning the model. Translation quality was evaluated with SacreBLEU.
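
The tokenization step described above can be sketched as follows. This is an illustration, not the author's actual script: the function name `preprocess_function` and the `max_length=128` truncation limit are assumptions, since the card does not list them. The `{"translation": {"en": ..., "fr": ...}}` record layout is the one the KDE4 dataset uses on the Hugging Face Hub.

```python
def preprocess_function(examples, tokenizer, max_length=128):
    """Tokenize a batch of English/French pairs for seq2seq fine-tuning."""
    inputs = [pair["en"] for pair in examples["translation"]]
    targets = [pair["fr"] for pair in examples["translation"]]
    # `text_target` makes the tokenizer encode the French side as `labels`.
    # max_length=128 is an assumed value, not taken from this card.
    return tokenizer(
        inputs, text_target=targets, max_length=max_length, truncation=True
    )
```

With a real tokenizer this would typically be applied through `dataset.map(lambda batch: preprocess_function(batch, tokenizer), batched=True)`, after which a collator such as `DataCollatorForSeq2Seq` can pad each batch to its longest sequence.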

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
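
As a sanity check, the hyperparameters above can be related to the step counts in the training log. The sketch below assumes a single device, no gradient accumulation, and no learning-rate warmup; none of these settings is stated in this card, so treat the numbers as an estimate. The training-set size is the 189,155 pairs reported above.

```python
import math

TRAIN_PAIRS = 189_155      # training split size reported in this card
TRAIN_BATCH_SIZE = 32
NUM_EPOCHS = 3
PEAK_LR = 2e-5

# Assumption: one device and no gradient accumulation, so one batch = one step.
steps_per_epoch = math.ceil(TRAIN_PAIRS / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, total=total_steps, peak=PEAK_LR):
    """Linear decay from `peak` to 0, assuming zero warmup steps."""
    return peak * max(0.0, (total - step) / total)


print(steps_per_epoch, total_steps)   # 5912 17736
print(linear_lr(17_500))              # learning rate near the end of training
```

Under these assumptions training runs for 17,736 steps, which is consistent with a last logged entry at step 17,500 when the loss is logged every 500 steps.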

Training details

Training loss, logged every 500 steps:

Step     Training Loss
500      1.423400
1000     1.233600
1500     1.184600
2000     1.125000
2500     1.113000
3000     1.070500
3500     1.063300
4000     1.031900
4500     1.017900
5000     1.008200
5500     1.002500
6000     0.973900
6500     0.907700
7000     0.920600
7500     0.905000
8000     0.900300
8500     0.888500
9000     0.892000
9500     0.881200
10000    0.890200
10500    0.881500
11000    0.876800
11500    0.861000
12000    0.854800
12500    0.819500
13000    0.818100
13500    0.827400
14000    0.806400
14500    0.811000
15000    0.815600
15500    0.818500
16000    0.804800
16500    0.827200
17000    0.808300
17500    0.807600

Framework versions

  • Transformers 4.31.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3