---
library_name: transformers
license: mit
language:
- en
- fa
tags:
- persian
- mt5-small
- mt5
- persian translation
- seq2seq
- farsi
---

# Model Card: English to Persian Translation using MT5-Small

## Model Details

**Model Description:**
This model translates text from English to Persian (Farsi) using the MT5-Small architecture. MT5 is the multilingual variant of T5, pretrained on a diverse set of languages.

**Intended Use:**
The model is intended for applications that require automatic translation from English to Persian, such as translating documents, web pages, or other text-based content.

**Model Architecture:**
- **Model Type:** MT5-Small
- **Language Pair:** English (en) to Persian (fa)

## Training Data

**Dataset:**
The model was trained on a parallel corpus of 100,000 English-Persian sentence pairs drawn from a variety of sources to cover a wide range of topics and ensure diversity.

**Data Preprocessing:**
- Text normalization was performed to ensure consistency.
- Tokenization was performed with the SentencePiece tokenizer used by MT5.

## Training Procedure

**Training Configuration:**
- **Number of Epochs:** 4
- **Batch Size:** 8
- **Learning Rate:** 5e-5
- **Optimizer:** AdamW

A hedged training sketch that uses these settings is provided at the end of this card.

**Hardware:**
- **Training Environment:** NVIDIA P100 GPU
- **Training Time:** Approximately 4 hours

## How To Use

```python
from transformers import MT5ForConditionalGeneration, MT5Tokenizer, Text2TextGenerationPipeline

# Load the fine-tuned model and its tokenizer once, then build the pipeline.
model = MT5ForConditionalGeneration.from_pretrained("NLPclass/mt5_en_fa_translation")
tokenizer = MT5Tokenizer.from_pretrained("NLPclass/mt5_en_fa_translation")
translator = Text2TextGenerationPipeline(model=model, tokenizer=tokenizer)

# Function to translate using the pipeline
def translate_with_pipeline(text):
    return translator(text, max_length=128, num_beams=4)[0]["generated_text"]

# Example usage
text = "Hello, how are you?"
print("Pipeline Translation:", translate_with_pipeline(text))
```

## Ethical Considerations

- The model's translations are only as good as the data it was trained on; biases present in the training data may propagate into the model's outputs.
- Users should be cautious when using the model for critical tasks, as automatic translations can be inaccurate or misleading.

## Citation

If you use this model in your research or applications, please cite it as follows:

```bibtex
@misc{mt5_en_fa_translation,
  author       = {mansoorhamidzadeh},
  title        = {English to Persian Translation using MT5-Small},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co./mansoorhamidzadeh/mt5_en_fa_translation}},
}
```
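
## Training Sketch

The sketch below shows one way to reproduce the training configuration listed above with `Seq2SeqTrainer`. Only the hyperparameters (4 epochs, batch size 8, learning rate 5e-5, AdamW) come from this card; the dataset file, column names (`en`, `fa`), output directory, and logging/saving settings are illustrative assumptions, not the released training code.

```python
from datasets import load_dataset
from transformers import (
    DataCollatorForSeq2Seq,
    MT5ForConditionalGeneration,
    MT5Tokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Assumption: a CSV parallel corpus with "en" (source) and "fa" (target) columns.
dataset = load_dataset("csv", data_files={"train": "en_fa_parallel.csv"})["train"]

model_name = "google/mt5-small"
tokenizer = MT5Tokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

max_length = 128

def preprocess(batch):
    # Tokenize English inputs and Persian targets with the MT5 SentencePiece tokenizer.
    inputs = tokenizer(batch["en"], max_length=max_length, truncation=True)
    labels = tokenizer(text_target=batch["fa"], max_length=max_length, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

# Hyperparameters taken from the "Training Configuration" section above.
args = Seq2SeqTrainingArguments(
    output_dir="mt5_en_fa_translation",  # assumed output path
    num_train_epochs=4,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    optim="adamw_torch",
    logging_steps=500,      # assumed logging cadence
    save_strategy="epoch",  # assumed checkpointing policy
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)

trainer.train()
```

`DataCollatorForSeq2Seq` pads each batch dynamically to its longest sequence, which keeps memory usage manageable on a single P100.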