---
library_name: transformers
language:
- en
- ur
metrics:
- bleu
---
## Fine-tuned mBART Model for English to Urdu Translation
This repository contains an mBART model fine-tuned for English-to-Urdu translation. It was trained on a custom English-Urdu dataset and evaluated on a test set.



## Model Information

- **Model Name:** `abdulwaheed1/english-to-urdu-translation-mbart`
- **Base Model:** `facebook/mbart-large-50`
- **Tokenizer:** `facebook/mbart-large-50`
- **Source Language:** English (`en`, mBART-50 code `en_XX`)
- **Target Language:** Urdu (`ur`, mBART-50 code `ur_PK`)

## Usage
```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Load the fine-tuned model and its tokenizer
model_name = "abdulwaheed1/english-to-urdu-translation-mbart"
tokenizer = MBart50TokenizerFast.from_pretrained(model_name, src_lang="en_XX", tgt_lang="ur_PK")
model = MBartForConditionalGeneration.from_pretrained(model_name)
```
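
The snippet above only loads the model and tokenizer. The following is a minimal sketch of what a translation call might look like, assuming the checkpoint follows the usual mBART-50 convention of selecting the output language through `forced_bos_token_id`. The helper name `translate` and the `max_length` and `num_beams` values are illustrative assumptions, not settings taken from this model's training or evaluation.

```python
def translate(text, tokenizer, model, tgt_lang="ur_PK", max_length=128, num_beams=4):
    """Translate one English sentence to Urdu (hypothetical helper)."""
    # Tokenize the source sentence as PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt")
    # mBART-50 picks the output language from the first generated token,
    # so force it to the target language code.
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_length=max_length,  # assumed value, tune as needed
        num_beams=num_beams,    # assumed value, tune as needed
    )
    # Decode the best hypothesis and drop special tokens such as language codes.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

With the `tokenizer` and `model` objects loaded as shown earlier, `translate("How are you?", tokenizer, model)` should return the Urdu translation as a string.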

## Evaluation

On the test set, the model obtained the following scores:

- **BLEU Score:** 35.87
- **Generation Length:** 42.56
- **METEOR Score:** 0.60
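
This card does not state how these metrics were computed. As a rough sketch, generation length is commonly reported as the mean number of non-padding tokens per generated sequence (as in the Hugging Face translation example scripts), and BLEU on a 0-100 scale is often computed with `sacrebleu`. Both choices are assumptions here, and the function names are hypothetical.

```python
def mean_generation_length(predictions, pad_token_id):
    """Average number of non-padding tokens per generated sequence."""
    lengths = [sum(1 for tok in seq if tok != pad_token_id) for seq in predictions]
    return sum(lengths) / len(lengths)


def corpus_bleu_score(hypotheses, references):
    """Corpus-level BLEU on a 0-100 scale, assuming sacrebleu is installed."""
    # Imported lazily: sacrebleu is an assumed dependency, not one named by this card.
    import sacrebleu

    # One reference stream, aligned one-to-one with the hypotheses.
    return sacrebleu.corpus_bleu(hypotheses, [references]).score
```

Here `predictions` is a list of generated token-ID sequences, while `hypotheses` and `references` are lists of decoded strings of equal length.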

## Training Details

The model was trained with the `transformers` library. Reported losses:

- **Training Loss:** 1.5697
- **Validation Loss:** 1.1256


## Dataset

The model was fine-tuned on a custom English-Urdu translation dataset. To reuse it, see the preprocessing script and dataset files in the `data` directory.

## Acknowledgments

The fine-tuning process and code were inspired by the [Hugging Face Transformers library](https://github.com/huggingface/transformers).



---