
This is an early checkpoint of sarvam-2b, a small yet powerful language model that is being pre-trained from scratch on a planned total of 4 trillion tokens. It is trained to perform well in 10 Indic languages plus English. The officially supported Indic languages are Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu.

sarvam-2b will be trained on a data mixture containing equal parts English (2T) and Indic (2T) tokens. The current checkpoint has seen a total of 2 trillion tokens, and has not undergone any post-training.
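
Because this checkpoint has not undergone any post-training, it will generally continue text rather than follow instructions. One common way to steer a base model like this is few-shot prompting. The sketch below is illustrative only: the helper name `build_few_shot_prompt`, the `English:`/`Hindi:` labels, and the demonstration pairs are assumptions made for this example, not a prompt format documented for sarvam-2b.

```python
def build_few_shot_prompt(examples, query, src="English", tgt="Hindi"):
    """Format (source, target) pairs as a completion-style prompt.

    The prompt ends with an open target label so that a base model can
    continue it with the translation of `query`.
    """
    blocks = [f"{src}: {s}\n{tgt}: {t}" for s, t in examples]
    blocks.append(f"{src}: {query}\n{tgt}:")
    return "\n\n".join(blocks)


# Hypothetical demonstration pairs; replace them with your own.
examples = [
    ("Good morning.", "सुप्रभात।"),
    ("Thank you.", "धन्यवाद।"),
]
prompt = build_few_shot_prompt(examples, "How are you?")
print(prompt)
```

The resulting string can be passed as the prompt to a text-generation pipeline such as the one in the Getting started snippet. Keeping `max_new_tokens` small helps limit how far the model runs past the answer.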

Getting started:

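from transformers import pipeline

# Load the checkpoint; device=0 places it on the first GPU.
pipe = pipeline(model='sarvamai/sarvam-2b-v0.5', device=0)
# Prompt (Hindi): "India's first Prime Minister"
pipe('भारत के प्रथम प्रधानमंत्री', max_new_tokens=15, temperature=0.1, repetition_penalty=1.2)[0]['generated_text']
# 'भारत के प्रथम प्रधानमंत्री जवाहरलाल नेहरू की बेटी इंदिरा गांधी थीं।\n\n'
# (English: "The daughter of India's first Prime Minister, Jawaharlal Nehru, was Indira Gandhi.")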
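
For more control than the pipeline offers, the tokenizer and model can also be loaded explicitly. The following is a minimal sketch, not the authors' reference code: the names `complete` and `GENERATION_KWARGS` are hypothetical, and `do_sample=True` is an addition that does not appear in the snippet above. In `transformers`, `temperature` only influences generation when sampling is enabled, so it is set here on that assumption. With sampling on, outputs can differ between runs.

```python
MODEL_ID = "sarvamai/sarvam-2b-v0.5"  # same checkpoint as the pipeline snippet

# Settings mirror the pipeline call; do_sample=True is an assumption added
# here so that the temperature value actually takes effect.
GENERATION_KWARGS = {
    "max_new_tokens": 15,
    "do_sample": True,
    "temperature": 0.1,
    "repetition_penalty": 1.2,
}


def complete(prompt, model_id=MODEL_ID, **overrides):
    """Return the prompt plus its continuation as decoded text."""
    # Imported lazily so this module can be loaded without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    gen_kwargs = {**GENERATION_KWARGS, **overrides}
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            **gen_kwargs,
        )
    # generate() returns prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete('भारत के प्रथम प्रधानमंत्री')` downloads the weights on first use. Passing `do_sample=False` as an override switches to greedy decoding.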

More technical details like evaluations and benchmarking will be posted soon.

GGUF
Model size: 2.51B params
Architecture: llama
Quantization: 6-bit
