|
--- |
|
library_name: transformers |
|
language: |
|
- en |
|
- ur |
|
metrics: |
|
- bleu |
|
--- |
|
## Fine-tuned mBART Model for English to Urdu Translation
|
This repository contains a fine-tuned mBART model for English to Urdu translation. The model was fine-tuned on a custom English-Urdu dataset and evaluated on held-out test data.
## Model Information |
|
|
|
- **Model Name:** `abdulwaheed63/mbart_en_ur_finetuned` |
|
- **Base Model:** `facebook/mbart-large-50` |
|
- **Tokenizer:** `facebook/mbart-large-50` |
|
- **Source Language:** English (`en`) |
|
- **Target Language:** Urdu (`ur`) |
|
|
|
## Usage |
|
```python
|
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration |
|
|
|
# Load the fine-tuned model |
|
model_name = "abdulwaheed63/mbart_en_ur_finetuned" |
|
# mBART-50 language codes: "en_XX" = English, "ur_PK" = Urdu
tokenizer = MBart50TokenizerFast.from_pretrained(model_name, src_lang="en_XX", tgt_lang="ur_PK")
|
model = MBartForConditionalGeneration.from_pretrained(model_name) |
|
``` |
|
|
|
## Evaluation |
|
|
|
The model has been evaluated on a test dataset, and the following metrics were obtained: |
|
|
|
- **BLEU Score:** 35.87 |
|
- **Generation Length:** 42.56 |
|
- **METEOR Score:** 0.60
|
|
|
## Training Details |
|
|
|
The model was fine-tuned using the `transformers` library. The final losses were:
|
|
|
- **Training Loss:** 1.5697 |
|
- **Validation Loss:** 1.1256 |
|
|
|
|
|
## Dataset |
|
|
|
The model was fine-tuned on a custom English-Urdu translation dataset. If you wish to use the same dataset, you can find the preprocessing script and dataset files in the `data` directory. |
|
|
|
## Acknowledgments |
|
|
|
The fine-tuning process and code were inspired by the [Hugging Face Transformers library](https://github.com/huggingface/transformers). |
|
|
|
|
|
|
|
|
|