|
--- |
|
license: apache-2.0 |
|
base_model: Helsinki-NLP/opus-mt-en-fr |
|
tags: |
|
- translation

- supervised

- kde4
|
- generated_from_trainer |
|
datasets: |
|
- kde4 |
|
metrics: |
|
- bleu |
|
model-index: |
|
- name: marian-finetuned-kde4-en-to-fr |
|
results: |
|
- task: |
|
name: Sequence-to-sequence Language Modeling |
|
type: text2text-generation |
|
dataset: |
|
name: kde4 |
|
type: kde4 |
|
config: en-fr |
|
split: train |
|
args: en-fr |
|
metrics: |
|
- name: Bleu |
|
type: bleu |
|
value: 49.64800786424299 |
|
language: |
|
- en

- fr
|
pipeline_tag: translation |
|
--- |
|
|
|
|
|
|
# marian-finetuned-kde4-en-to-fr |
|
|
|
This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-fr](https://huggingface.co./Helsinki-NLP/opus-mt-en-fr) for English-to-French translation. It was trained on the `kde4` dataset of parallel texts from the KDE project, which makes it particularly well suited to translating technical and software documentation.
|
|
|
## Model Description |
|
|
|
MarianMT is a neural machine translation architecture built on the Marian framework and designed for fast training and inference. This model, `marian-finetuned-kde4-en-to-fr`, starts from the pre-trained `opus-mt-en-fr` checkpoint and adapts it to the KDE4 dataset, which centers on software and technical documentation.
|
|
|
### Key Features: |
|
- **Base Model**: [Helsinki-NLP/opus-mt-en-fr](https://huggingface.co./Helsinki-NLP/opus-mt-en-fr), a robust English-to-French translation model. |
|
- **Fine-Tuned For**: Specialized translation of technical and software documentation. |
|
- **Architecture**: Transformer-based MarianMT, known for efficient and scalable translation capabilities. |
|
|
|
## Intended Uses & Limitations |
|
|
|
### Intended Uses: |
|
- **Technical Documentation Translation**: Translate software documentation, user manuals, and other technical texts from English to French. |
|
- **Software Localization**: Aid in the localization process by translating software interfaces and messages. |
|
- **General English-to-French Translation**: While specialized for technical texts, it can also handle general translation tasks. |
|
|
|
### Limitations: |
|
- **Domain-Specific Performance**: The model's fine-tuning on technical texts means it excels in those areas but may not perform as well with colloquial language or literary texts. |
|
- **Biases**: The model may reflect biases present in the training data, particularly around technical jargon and software terminology. |
|
- **Limited Language Support**: This model is designed specifically for English-to-French translation. It is not suitable for other language pairs without further fine-tuning. |
|
|
|
## Training and Evaluation Data |
|
|
|
### Dataset: |
|
- **Training Data**: The `kde4` dataset, which includes parallel English-French sentences derived from the KDE project. This dataset primarily consists of translations relevant to software documentation, user interfaces, and related technical content. |
|
- **Evaluation Data**: A held-out subset of the `kde4` dataset was used for evaluation, so the reported metrics reflect performance in the same technical domain the model was trained on; the split can be reproduced as sketched below.
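As a point of reference, the corpus can be loaded and split with the Hugging Face Datasets library. The 90/10 ratio and seed below are illustrative assumptions, not necessarily the exact values used for this model:

```python
from datasets import load_dataset

# Load the English-French portion of the KDE4 corpus.
raw_datasets = load_dataset("kde4", lang1="en", lang2="fr")

# Hold out a validation split; the ratio and seed are assumptions.
split_datasets = raw_datasets["train"].train_test_split(train_size=0.9, seed=42)
print(split_datasets["train"][0]["translation"])  # {'en': '...', 'fr': '...'}
```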
|
|
|
### Data Characteristics: |
|
- **Domain**: Technical documentation, software localization. |
|
- **Language**: Primarily English and French. |
|
- **Dataset Size**: On the order of 200,000 parallel sentence pairs, providing a solid basis for fine-tuning in the technical domain.
|
|
|
## Training Procedure |
|
|
|
### Training Hyperparameters: |
|
- **Learning Rate**: 2e-05 |
|
- **Train Batch Size**: 32 |
|
- **Eval Batch Size**: 64 |
|
- **Seed**: 42 |
|
- **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08 |
|
- **Learning Rate Scheduler Type**: Linear |
|
- **Number of Epochs**: 3 |
|
- **Mixed Precision Training**: Native AMP (Automatic Mixed Precision) to optimize training time and memory usage. |
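Taken together, these hyperparameters map directly onto `Seq2SeqTrainingArguments` from Transformers. The following is a minimal sketch of the corresponding configuration; `output_dir` and `predict_with_generate` are illustrative assumptions rather than documented settings:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="marian-finetuned-kde4-en-to-fr",  # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    fp16=True,                   # native AMP mixed precision
    predict_with_generate=True,  # assumed; needed to score BLEU during eval
)
```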
|
|
|
### Training Results: |
|
|
|
| Metric | Value | |
|
|---------------|----------| |
|
| Training Loss | 1.0371 | |
|
| Evaluation Loss | 1.0371 | |
|
| BLEU Score | 49.6480 | |
|
|
|
A BLEU score of roughly 49.6 is strong for domain-specific machine translation, though it primarily reflects performance on KDE-style technical text rather than general-purpose translation.
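For context, BLEU scores of this kind are commonly computed with sacreBLEU through the Hugging Face Evaluate library. A minimal scoring sketch with illustrative sentences:

```python
import evaluate

metric = evaluate.load("sacrebleu")

# Illustrative prediction/reference pair; each prediction may have
# multiple references, hence the nested list.
predictions = ["Le manuel fournit des instructions détaillées."]
references = [["Le manuel fournit des instructions détaillées."]]

result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 2))  # corpus-level BLEU on a 0-100 scale
```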
|
|
|
### Framework Versions |
|
- **Transformers**: 4.42.4 |
|
- **PyTorch**: 2.3.1+cu121 |
|
- **Datasets**: 2.21.0 |
|
- **Tokenizers**: 0.19.1 |
|
|
|
## Usage |
|
|
|
You can use this model in a Hugging Face pipeline for translation tasks: |
|
|
|
```python |
|
from transformers import pipeline |
|
|
|
model_checkpoint = "ashaduzzaman/marian-finetuned-kde4-en-to-fr" |
|
translator = pipeline("translation", model=model_checkpoint) |
|
|
|
# Example usage |
|
input_text = "The user manual provides detailed instructions on how to use the software." |
|
translation = translator(input_text) |
|
print(translation)  # e.g. [{'translation_text': '...'}]
|
``` |
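If you need finer control over generation (for example beam size or output length), the tokenizer and model can also be loaded directly; the `max_new_tokens` value below is an illustrative choice:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_checkpoint = "ashaduzzaman/marian-finetuned-kde4-en-to-fr"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

inputs = tokenizer(
    "The user manual provides detailed instructions.", return_tensors="pt"
)
outputs = model.generate(**inputs, max_new_tokens=128)  # illustrative limit
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```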
|
|
|
## Acknowledgments |
|
|
|
This model was developed using the [Hugging Face Transformers](https://huggingface.co./transformers) library and fine-tuned using the `kde4` dataset. Special thanks to the contributors of the KDE project for providing a rich source of multilingual technical content. |