---
base_model: google/mt5-small
datasets:
- syubraj/roman2nepali-transliteration
language:
- ne
- en
library_name: transformers
license: apache-2.0
metrics:
- bleu
tags:
- generated_from_trainer
model-index:
- name: romaneng2nep_v2
  results: []
---

# romaneng2nep_v2

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co./google/mt5-small) on the [syubraj/roman2nepali-transliteration](https://huggingface.co./datasets/syubraj/roman2nepali-transliteration) dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9652
- Gen Len: 5.1538
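
The card metadata lists BLEU as the evaluation metric, though no BLEU score is reported above. Below is a minimal sketch of how such a score could be computed with the `evaluate` library (requires the `evaluate` and `sacrebleu` packages); the prediction/reference pair is purely illustrative and not taken from the evaluation set.

```python
import evaluate

# Load sacreBLEU; real scoring would compare model outputs against the
# gold transliterations of the dataset's validation split.
sacrebleu = evaluate.load("sacrebleu")

predictions = ["नमस्ते"]     # hypothetical model output for "namaste"
references = [["नमस्ते"]]    # gold Nepali transliteration (one reference per prediction)

print(sacrebleu.compute(predictions=predictions, references=references)["score"])
```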


## Model Usage

```python
!pip install transformers
```

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/romaneng2nep_v3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text with a max length of 20
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate translation
    translated = model.generate(**inputs)

    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "muskuraudai"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
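
Continuing from the snippet above, here is a minimal batched variant (a sketch: `num_beams` and `max_new_tokens` are illustrative generation settings, not values published with this model):

```python
def translate_batch(texts, num_beams=4, max_new_tokens=20):
    # Tokenize all inputs together; padding turns them into a single batch
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_seq_len)

    # Generate transliterations for the whole batch
    outputs = model.generate(**inputs, num_beams=num_beams,
                             max_new_tokens=max_new_tokens)

    # Decode every sequence back to text
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(translate_batch(["muskuraudai", "namaste"]))
```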


## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
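
These values correspond to a standard `transformers` Seq2Seq setup. The following is a hedged sketch of how they might map onto `Seq2SeqTrainingArguments`; the output directory, evaluation cadence, and `predict_with_generate` flag are assumptions, not taken from the card, and the Adam betas/epsilon above are the library defaults.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="romaneng2nep_v2",   # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=4,
    eval_strategy="steps",          # assumed; the results table logs every 1000 steps
    eval_steps=1000,                # assumed
    predict_with_generate=True,     # assumed; needed to report Gen Len
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```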

### Training results

| Step   | Training Loss | Validation Loss | Gen Len  |
|--------|---------------|-----------------|----------|
| 1000   | 15.0703       | 5.6154          | 2.3840   |
| 2000   | 6.0460        | 4.4449          | 4.6281   |
| 3000   | 5.2580        | 3.9632          | 4.7790   |
| 4000   | 4.8563        | 3.6188          | 5.0053   |
| 5000   | 4.5602        | 3.3491          | 5.3085   |
| 6000   | 4.3146        | 3.1572          | 5.2562   |
| 7000   | 4.1228        | 3.0084          | 5.2197   |
| 8000   | 3.9695        | 2.8727          | 5.2140   |
| 9000   | 3.8342        | 2.7651          | 5.1834   |
| 10000  | 3.7319        | 2.6661          | 5.1977   |
| 11000  | 3.6485        | 2.5864          | 5.1536   |
| 12000  | 3.5541        | 2.5080          | 5.1990   |
| 13000  | 3.4959        | 2.4464          | 5.1775   |
| 14000  | 3.4315        | 2.3931          | 5.1747   |
| 15000  | 3.3663        | 2.3401          | 5.1625   |
| 16000  | 3.3204        | 2.3034          | 5.1481   |
| 17000  | 3.2417        | 2.2593          | 5.1663   |
| 18000  | 3.2186        | 2.2283          | 5.1351   |
| 19000  | 3.1822        | 2.1946          | 5.1573   |
| 20000  | 3.1449        | 2.1690          | 5.1649   |
| 21000  | 3.1067        | 2.1402          | 5.1624   |
| 22000  | 3.0844        | 2.1258          | 5.1479   |
| 23000  | 3.0574        | 2.1066          | 5.1518   |
| 24000  | 3.0357        | 2.0887          | 5.1446   |
| 25000  | 3.0136        | 2.0746          | 5.1559   |
| 26000  | 2.9957        | 2.0609          | 5.1658   |
| 27000  | 2.9865        | 2.0510          | 5.1791   |
| 28000  | 2.9765        | 2.0456          | 5.1574   |
| 29000  | 2.9675        | 2.0386          | 5.1620   |
| 30000  | 2.9678        | 2.0344          | 5.1601   |
| 31000  | 2.9652        | 2.0320          | 5.1538   |


### Framework versions

- Transformers 4.45.1
- Pytorch 2.4.0
- Datasets 3.0.1
- Tokenizers 0.20.0

### Citation
If you find this model useful, please cite the work.

```
@misc {yubraj_sigdel_2024,
	author       = { {Yubraj Sigdel} },
	title        = { romaneng2nep_v3 (Revision dca017e) },
	year         = 2024,
	url          = { https://huggingface.co./syubraj/romaneng2nep_v3 },
	doi          = { 10.57967/hf/3252 },
	publisher    = { Hugging Face }
}
```