roberta-urdu-small
Overview
Language model: roberta-urdu-small Model size: 125M Language: Urdu Training data: News data from urdu news resources in Pakistan
About roberta-urdu-small
roberta-urdu-small is a language model for urdu language.
from transformers import pipeline
fill_mask = pipeline("fill-mask", model="urduhack/roberta-urdu-small", tokenizer="urduhack/roberta-urdu-small")
Training procedure
roberta-urdu-small was trained on urdu news corpus. Training data was normalized using normalization module from urduhack to eliminate characters from other languages like arabic.
About Urduhack
Urduhack is a Natural Language Processing (NLP) library for urdu language. Github: https://github.com/urduhack/urduhack
- Downloads last month
- 1,485
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.