Hugging Face: Banglish to Bangla Translation

This repository demonstrates how to translate Banglish (Romanized Bangla) text into Bangla script with a Hugging Face model, using the MBart50 tokenizer and model classes from the transformers library. The model, Mdkaif2782/banglish-to-bangla, is already fine-tuned for this task, so it can be loaded and used without further training.

Setup in Google Colab

Follow these steps to use the model in Google Colab:

1. Install Dependencies

Make sure the transformers and torch libraries are installed. Run the following command in your Colab notebook:

!pip install transformers torch
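
To confirm that the install worked before loading the model, a small check like the one below can help. This is a minimal sketch that relies only on the Python standard library; the helper name installed_versions is hypothetical and not part of transformers or torch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "torch")):
    """Map each package name to its installed version, or None if it is missing.

    Hypothetical helper for illustration; it only reads package metadata.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions().items():
    print(f"{name}: {installed or 'not installed'}")
```

If either package is reported as not installed, rerun the pip command above and restart the Colab runtime.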

2. Load and Use the Model

Copy the code below into a cell in your Colab notebook to start translating Banglish to Bangla:

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
import torch

# Load the pre-trained model and tokenizer directly from Hugging Face
model_name = "Mdkaif2782/banglish-to-bangla"
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)

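def translate_banglish_to_bangla(model, tokenizer, banglish_input):
    # Tokenize the input; anything beyond 128 tokens is truncated
    inputs = tokenizer(banglish_input, return_tensors="pt", padding=True, truncation=True, max_length=128)

    # Move the inputs and the model to the GPU when one is available
    if torch.cuda.is_available():
        inputs = {key: value.cuda() for key, value in inputs.items()}
        model = model.cuda()

    # Generate the output, starting decoding from the Bangla (bn_IN) language token
    translated_tokens = model.generate(**inputs, decoder_start_token_id=tokenizer.lang_code_to_id["bn_IN"])
    # Decode the first (and only) sequence back to text, dropping special tokens
    translated_text = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]

    return translated_text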

# Take custom input
print("Enter your Banglish text (type 'exit' to quit):")
while True:
    banglish_text = input("Banglish: ")
    if banglish_text.lower() == "exit":
        break

    # Translate Banglish to Bangla
    translated_text = translate_banglish_to_bangla(model, tokenizer, banglish_text)
    print(f"Translated Bangla: {translated_text}\n")

3. Run the Notebook

  1. Paste the above code into a cell.
  2. Run the cell.
  3. Enter your Banglish text in the input prompt to get the translated Bangla text. Type exit to quit.

Example Usage

Input:

Banglish: amar valo lagche onek

Output:

Translated Bangla: আমার ভালো লাগছে অনেক
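
To translate a list of sentences without the interactive prompt, the same tokenizer and generate calls can be applied to small batches. The following is a sketch under stated assumptions, not part of the original code: the helper names chunked, translate_many, and make_batch_translator are hypothetical, the batch size of 8 is arbitrary, and the generation settings simply mirror the snippet in step 2.

```python
from typing import Callable, Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_many(
    texts: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,
) -> List[str]:
    """Translate `texts` batch by batch, keeping the original order."""
    results: List[str] = []
    for batch in chunked(texts, batch_size):
        results.extend(translate_batch(batch))
    return results


def make_batch_translator(model, tokenizer, lang_code: str = "bn_IN"):
    """Hypothetical wrapper around the model and tokenizer loaded in step 2."""
    # Same start token as the single-sentence function in step 2
    start_id = tokenizer.lang_code_to_id[lang_code]

    def translate_batch(batch: List[str]) -> List[str]:
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=128)
        # Place the inputs on whichever device the model currently lives on
        inputs = {key: value.to(model.device) for key, value in inputs.items()}
        tokens = model.generate(**inputs, decoder_start_token_id=start_id)
        return tokenizer.batch_decode(tokens, skip_special_tokens=True)

    return translate_batch
```

With the model and tokenizer loaded as in step 2, calling translate_many(sentences, make_batch_translator(model, tokenizer)) returns one Bangla string per input sentence. Keeping the batching logic separate from the model call makes it easy to try a different batch size if the runtime runs out of memory.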

Notes

  • For faster processing, use a GPU runtime in Google Colab: go to Runtime > Change runtime type and select GPU. The code also runs on CPU, only more slowly.
  • The model Mdkaif2782/banglish-to-bangla can be fine-tuned further if required.
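
To see whether the Colab runtime actually exposes a GPU, a quick check along these lines may be useful. This is a minimal sketch; the helper name pick_device is hypothetical, and it falls back to "cpu" when PyTorch is not installed.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch is installed and sees a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is missing, so nothing can run on a GPU
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

If this prints cpu after selecting a GPU runtime, the runtime may need to be restarted for the change to take effect.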

License

This project uses the Hugging Face transformers library. Refer to the Hugging Face documentation for more details.
