---
library_name: transformers
license: mit
language:
- fa
tags:
- persian
- mt5-small
- mt5
- persian translation
- seq2seq
- farsi
---
# Model Card: English to Persian Translation using MT5-Small
## Model Details
**Model Description:**
This model translates text from English to Persian (Farsi) using the MT5-Small architecture. MT5 is a multilingual variant of the T5 model, pretrained on the mC4 corpus, which covers 101 languages.
**Intended Use:**
The model is intended for applications that require automatic translation from English to Persian, such as translating documents, web pages, or other text-based content.
**Model Architecture:**
- **Model Type:** MT5-Small
- **Language Pair:** English (en) to Persian (fa)
## Training Data
**Dataset:**
The model was trained on a dataset of 100,000 parallel English-Persian sentence pairs. The data is drawn from various sources to cover a wide range of topics.
**Data Preprocessing:**
- Text normalization was performed to ensure consistency.
- Tokenization was done using the SentencePiece tokenizer.
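
The card does not list the exact normalization steps, so the following is only a minimal sketch of what normalizing a parallel sentence pair might look like. The character table and the function names `normalize_text` and `normalize_pair` are illustrative assumptions, not the preprocessing actually used for this model.

```python
import re
import unicodedata

# Arabic code points that often appear in Persian text, mapped to their
# Persian forms. This table is an assumption for illustration only; the
# model card does not document which replacements were applied.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian keheh
})


def normalize_text(text: str) -> str:
    """Apply Unicode NFC, unify Arabic/Persian letters, collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(ARABIC_TO_PERSIAN)
    # Collapse runs of whitespace; the zero-width non-joiner (U+200C) is not
    # whitespace, so it is left untouched.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def normalize_pair(english: str, persian: str) -> tuple[str, str]:
    """Normalize both sides of one parallel sentence pair."""
    return normalize_text(english), normalize_text(persian)
```

Applying the same function to both sides keeps whitespace handling consistent, while the letter mapping only affects the Persian side in practice.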
## Training Procedure
**Training Configuration:**
- **Number of Epochs:** 4
- **Batch Size:** 8
- **Learning Rate:** 5e-5
- **Optimizer:** AdamW
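
The card does not say which training loop was used. As a rough sketch, the hyperparameters above could be expressed as `Seq2SeqTrainingArguments` from `transformers`, assuming the batch size is per device and that AdamW refers to the PyTorch implementation. The `output_dir` value and the `build_training_args` helper are hypothetical.

```python
# Hyperparameters as listed in the card, keyed by the corresponding
# Seq2SeqTrainingArguments keyword names.
TRAINING_CONFIG = {
    "num_train_epochs": 4,
    "per_device_train_batch_size": 8,  # assumption: batch size is per device
    "learning_rate": 5e-5,
    "optim": "adamw_torch",  # assumption: PyTorch's AdamW implementation
}


def build_training_args(output_dir="mt5_en_fa_translation"):
    """Build training arguments; output_dir is a hypothetical path."""
    # Imported lazily so the config can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # decode with generate() during evaluation
        **TRAINING_CONFIG,
    )
```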
**Hardware:**
- **Training Environment:** NVIDIA P100 GPU
- **Training Time:** Approximately 4 hours
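
To put these numbers in context, the short calculation below estimates the number of optimizer steps and the average throughput. It assumes that all 100,000 sentence pairs were used for training (no held-out split), that there was no gradient accumulation, and that a single GPU was used. None of these assumptions is stated in the card.

```python
import math

NUM_SENTENCE_PAIRS = 100_000  # from the Training Data section
BATCH_SIZE = 8
NUM_EPOCHS = 4
TRAINING_HOURS = 4  # "approximately", so the throughput is only an estimate

# One optimizer step per batch (assumes no gradient accumulation).
steps_per_epoch = math.ceil(NUM_SENTENCE_PAIRS / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
steps_per_second = total_steps / (TRAINING_HOURS * 3600)

print(f"Steps per epoch: {steps_per_epoch}")       # 12500
print(f"Total steps:     {total_steps}")           # 50000
print(f"Steps/second:    {steps_per_second:.2f}")  # 3.47
```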
## How To Use
```python
from transformers import pipeline

# Load the translation pipeline once so the model is not reloaded on every call
translator = pipeline(
    "text2text-generation",
    model="NLPclass/mt5_en_fa_translation",
    tokenizer="NLPclass/mt5_en_fa_translation",
)

# Function to translate using the pipeline
def translate_with_pipeline(text):
    # Beam search with 4 beams; output is capped at 128 tokens
    return translator(text, max_length=128, num_beams=4)[0]["generated_text"]

# Example usage
text = "Hello, how are you?"

# Using pipeline
print("Pipeline Translation:", translate_with_pipeline(text))
```
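
For translating many sentences at once, calling the model directly in batches may be more convenient than calling the pipeline once per sentence. The following is a minimal sketch, assuming the checkpoint loads with `AutoTokenizer` and `AutoModelForSeq2SeqLM` and needs no task prefix (the pipeline example passes raw text). The helper names `chunked`, `load_model`, and `translate_batch` are illustrative and not part of the model's API.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_model(model_id="NLPclass/mt5_en_fa_translation"):
    """Load tokenizer and model; imports are lazy to keep the helpers light."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()  # inference only
    return tokenizer, model


def translate_batch(texts, tokenizer, model, batch_size=8,
                    max_length=128, num_beams=4):
    """Translate a list of English sentences, batch_size sentences at a time."""
    import torch

    translations = []
    for batch in chunked(texts, batch_size):
        # Pad to the longest sentence in the batch; truncate overly long input.
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=max_length)
        with torch.no_grad():
            output_ids = model.generate(**inputs, max_length=max_length,
                                        num_beams=num_beams)
        translations.extend(
            tokenizer.batch_decode(output_ids, skip_special_tokens=True)
        )
    return translations
```

A typical call would be `tokenizer, model = load_model()` followed by `translate_batch(["Hello, how are you?", "Thank you."], tokenizer, model)`. The generation settings mirror those of the pipeline example (`max_length=128`, `num_beams=4`).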
## Ethical Considerations
- The model's translations are only as good as the data it was trained on, and biases present in the training data may propagate through the model's outputs.
- Users should be cautious when using the model for critical tasks, as automatic translations can sometimes be inaccurate or misleading.
## Citation
If you use this model in your research or applications, please cite it as follows:
```bibtex
@misc{mt5_en_fa_translation,
author = {mansoorhamidzadeh},
title = {English to Persian Translation using MT5-Small},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/mansoorhamidzadeh/mt5_en_fa_translation}},
}
```