---
base_model: google/mt5-small
datasets:
- syubraj/roman2nepali-transliteration
language:
- ne
- en
library_name: transformers
license: apache-2.0
metrics:
- bleu
tags:
- generated_from_trainer
model-index:
- name: romaneng2nep_v2
  results: []
---


# romaneng2nep_v2

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration) dataset.
It achieves the following results on the evaluation set:
- Loss: 2.0320 (validation loss at step 31000; the training loss at that step was 2.9652)
- Gen Len: 5.1538


## Model Usage

```python
!pip install transformers
```

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/romaneng2nep_v3"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Maximum number of input tokens; longer inputs are truncated
max_seq_len = 20

def translate(text):
    # Tokenize the input text, truncating it to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate the output token IDs
    translated = model.generate(**inputs)

    # Decode the generated token IDs back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "muskuraudai"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
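
The `translate` function above truncates its input to 20 tokens, so longer text may be cut off. One possible workaround is to process the input word by word. The sketch below assumes that whitespace-separated words can be converted independently of one another, which this card does not state. `transliterate_sentence` and `fake_translate` are hypothetical helpers for illustration, not part of the model or the library.

```python
from typing import Callable


def transliterate_sentence(sentence: str, translate_fn: Callable[[str], str]) -> str:
    """Convert each whitespace-separated word and rejoin the results with spaces.

    Assumption: each word can be converted without its neighbours as context.
    """
    words = sentence.split()
    return " ".join(translate_fn(word) for word in words)


# Stand-in for translate() so the sketch runs without downloading the model.
def fake_translate(word: str) -> str:
    return f"<{word}>"


print(transliterate_sentence("mero  naam", fake_translate))  # prints "<mero> <naam>"
```

With the model loaded, `translate` can be passed in place of `fake_translate`.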


## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
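
To illustrate what `lr_scheduler_type: linear` implies for the learning rate, here is a minimal sketch of a schedule that decays linearly to zero. The warmup length (0) and the total step count (31,000, the last step logged in the training results) are assumptions, since the card states neither. `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 31_000  # assumption: the true number of training steps is not given

for step in (0, 15_500, 31_000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 2e-05, is halved at the midpoint, and reaches zero at the final step.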

### Training results

| Step  | Training Loss | Validation Loss | Gen Len |
|-------|---------------|-----------------|---------|
| 1000  | 15.0703       | 5.6154          | 2.3840  |
| 2000  | 6.0460        | 4.4449          | 4.6281  |
| 3000  | 5.2580        | 3.9632          | 4.7790  |
| 4000  | 4.8563        | 3.6188          | 5.0053  |
| 5000  | 4.5602        | 3.3491          | 5.3085  |
| 6000  | 4.3146        | 3.1572          | 5.2562  |
| 7000  | 4.1228        | 3.0084          | 5.2197  |
| 8000  | 3.9695        | 2.8727          | 5.2140  |
| 9000  | 3.8342        | 2.7651          | 5.1834  |
| 10000 | 3.7319        | 2.6661          | 5.1977  |
| 11000 | 3.6485        | 2.5864          | 5.1536  |
| 12000 | 3.5541        | 2.5080          | 5.1990  |
| 13000 | 3.4959        | 2.4464          | 5.1775  |
| 14000 | 3.4315        | 2.3931          | 5.1747  |
| 15000 | 3.3663        | 2.3401          | 5.1625  |
| 16000 | 3.3204        | 2.3034          | 5.1481  |
| 17000 | 3.2417        | 2.2593          | 5.1663  |
| 18000 | 3.2186        | 2.2283          | 5.1351  |
| 19000 | 3.1822        | 2.1946          | 5.1573  |
| 20000 | 3.1449        | 2.1690          | 5.1649  |
| 21000 | 3.1067        | 2.1402          | 5.1624  |
| 22000 | 3.0844        | 2.1258          | 5.1479  |
| 23000 | 3.0574        | 2.1066          | 5.1518  |
| 24000 | 3.0357        | 2.0887          | 5.1446  |
| 25000 | 3.0136        | 2.0746          | 5.1559  |
| 26000 | 2.9957        | 2.0609          | 5.1658  |
| 27000 | 2.9865        | 2.0510          | 5.1791  |
| 28000 | 2.9765        | 2.0456          | 5.1574  |
| 29000 | 2.9675        | 2.0386          | 5.1620  |
| 30000 | 2.9678        | 2.0344          | 5.1601  |
| 31000 | 2.9652        | 2.0320          | 5.1538  |
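
As a reading aid for the table, the sketch below computes how much the validation loss drops per 1,000 steps between a few checkpoints. The loss values are copied from the table; the choice of checkpoints and the helper `drop_per_1000_steps` are illustrative only.

```python
# Validation loss at selected steps, copied from the table above.
val_loss = {1000: 5.6154, 10000: 2.6661, 20000: 2.1690, 30000: 2.0344, 31000: 2.0320}


def drop_per_1000_steps(losses: dict) -> dict:
    """Average loss decrease per 1,000 steps between consecutive checkpoints."""
    steps = sorted(losses)
    rates = {}
    for start, end in zip(steps, steps[1:]):
        rates[(start, end)] = (losses[start] - losses[end]) / ((end - start) / 1000)
    return rates


for (start, end), rate in drop_per_1000_steps(val_loss).items():
    print(f"{start:>5} -> {end:>5}: {rate:.4f} per 1000 steps")
```

The per-interval drop shrinks from roughly 0.33 early in training to about 0.002 over the last 1,000 steps, so the validation loss has largely flattened by step 31000.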


### Framework versions

- Transformers 4.45.1
- PyTorch 2.4.0
- Datasets 3.0.1
- Tokenizers 0.20.0

### Citation
If you find this model useful, please cite the work.

```
@misc{yubraj_sigdel_2024,
    author    = { {Yubraj Sigdel} },
    title     = { romaneng2nep_v3 (Revision dca017e) },
    year      = 2024,
    url       = { https://huggingface.co/syubraj/romaneng2nep_v3 },
    doi       = { 10.57967/hf/3252 },
    publisher = { Hugging Face }
}
```