Francesco-A
/

finetuned-kde4-en-to-fr

	---
	license: apache-2.0
	base_model: Helsinki-NLP/opus-mt-en-fr
	tags:
	- translation
	- generated_from_trainer
	datasets:
	- kde4
	metrics:
	- bleu
	model-index:
	- name: finetuned-kde4-en-to-fr
	  results:
	  - task:
	      name: Sequence-to-sequence Language Modeling
	      type: text2text-generation
	    dataset:
	      name: kde4
	      type: kde4
	      config: en-fr
	      split: train
	      args: en-fr
	    metrics:
	    - name: Bleu
	      type: bleu
	      value: 52.88529894542656
	---

	# Model description (finetuned-kde4-en-to-fr)

	This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-fr](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr) on the kde4 dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.8556
	- BLEU: 52.8853

	## Intended uses
	- Translation of English text to French
	- Translating technical text in the software and computer science domain, the domain covered by the KDE4 fine-tuning data
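
A minimal usage sketch, assuming the checkpoint is available on the Hugging Face Hub under the repository ID `Francesco-A/finetuned-kde4-en-to-fr` and that `transformers` is installed with a PyTorch backend and `sentencepiece`. The `translate` helper is a hypothetical name introduced here for illustration; it is not part of the model or the library.

```python
MODEL_ID = "Francesco-A/finetuned-kde4-en-to-fr"  # assumed Hub repository ID


def translate(texts, model_id=MODEL_ID):
    """Translate a list of English strings to French (hypothetical helper)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the checkpoint on first use and caches it locally.
    translator = pipeline("translation", model=model_id)
    # The pipeline returns one dict per input, keyed by "translation_text".
    return [out["translation_text"] for out in translator(texts)]
```

Calling `translate(["Default to expanded threads"])` should return a list containing one French string. For repeated calls it would be cheaper to build the pipeline once and reuse it rather than reloading it inside the helper.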

	## Limitations
	- The model's performance may degrade when translating sentences with complex or domain-specific terminology that was not present in the training data.
	- It may struggle with idiomatic expressions and cultural nuances that are not captured in the training data.

	## Training and evaluation data

	The model was fine-tuned on the KDE4 dataset, which consists of pairs of sentences in English and their French translations. The dataset contains 189,155 pairs for training and 21,018 pairs for validation.

	## Training procedure

	The model was trained with the `Seq2SeqTrainer` API from the 🤗 Transformers library. Training involved three steps: tokenizing the English source sentences and the French target sentences, configuring data collation for dynamic batching, and fine-tuning the model. Evaluation used the SacreBLEU metric.
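
To illustrate the data collation step, here is a library-free sketch of dynamic padding: each batch is padded only to the length of its own longest sequence, and label padding uses -100 so that PyTorch's cross-entropy loss ignores those positions. This is an illustration under assumptions, not the collator actually used for this model; `pad_batch`, the example token IDs, and the pad ID of 0 are hypothetical.

```python
LABEL_PAD_ID = -100  # ignored by PyTorch's cross-entropy loss by default


def pad_batch(features, pad_token_id=0):
    """Pad one batch to its own longest input and longest label sequence."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_pad = max_input - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding.
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_pad)
        batch["labels"].append(
            f["labels"] + [LABEL_PAD_ID] * (max_label - len(f["labels"]))
        )
    return batch


# Made-up token IDs, for illustration only.
batch = pad_batch([
    {"input_ids": [5, 6, 7], "labels": [8, 9]},
    {"input_ids": [5], "labels": [8, 9, 10, 11]},
])
print(batch["input_ids"])
print(batch["labels"])
```

Padding per batch rather than to a fixed global length avoids spending compute on padding tokens when most sentences are short.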

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 32
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
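
As a rough check on these settings, the sketch below derives the number of optimizer steps and the learning rate under a linear schedule. It uses the 189,155 training pairs reported in this card and assumes a single device, no gradient accumulation, and zero warmup steps; none of those three assumptions is stated in the card.

```python
import math

TRAIN_PAIRS = 189_155  # training pairs reported in this card
BATCH_SIZE = 32        # train_batch_size
NUM_EPOCHS = 3         # num_epochs
BASE_LR = 2e-5         # learning_rate

# The last, partial batch of each epoch still counts as one step.
steps_per_epoch = math.ceil(TRAIN_PAIRS / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, total=total_steps, base_lr=BASE_LR):
    """Learning rate after `step` updates, linear decay, no warmup (assumed)."""
    return base_lr * max(0.0, (total - step) / total)


print(steps_per_epoch, total_steps)
print(linear_lr(0), linear_lr(total_steps // 2), linear_lr(total_steps))
```

Under these assumptions training runs for 17,736 steps, which is consistent with the last logged step of 17,500 in the training loss table.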

	### Training details
	The training loss was logged every 500 steps:

	| Step  | Training Loss |
	|-------|---------------|
	| 500   | 1.423400      |
	| 1000  | 1.233600      |
	| 1500  | 1.184600      |
	| 2000  | 1.125000      |
	| 2500  | 1.113000      |
	| 3000  | 1.070500      |
	| 3500  | 1.063300      |
	| 4000  | 1.031900      |
	| 4500  | 1.017900      |
	| 5000  | 1.008200      |
	| 5500  | 1.002500      |
	| 6000  | 0.973900      |
	| 6500  | 0.907700      |
	| 7000  | 0.920600      |
	| 7500  | 0.905000      |
	| 8000  | 0.900300      |
	| 8500  | 0.888500      |
	| 9000  | 0.892000      |
	| 9500  | 0.881200      |
	| 10000 | 0.890200      |
	| 10500 | 0.881500      |
	| 11000 | 0.876800      |
	| 11500 | 0.861000      |
	| 12000 | 0.854800      |
	| 12500 | 0.819500      |
	| 13000 | 0.818100      |
	| 13500 | 0.827400      |
	| 14000 | 0.806400      |
	| 14500 | 0.811000      |
	| 15000 | 0.815600      |
	| 15500 | 0.818500      |
	| 16000 | 0.804800      |
	| 16500 | 0.827200      |
	| 17000 | 0.808300      |
	| 17500 | 0.807600      |


	### Framework versions

	- Transformers 4.31.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.4
	- Tokenizers 0.13.3