|
--- |
|
license: mit |
|
datasets: |
|
- VishnuPJ/Malayalam_CultureX_IndicCorp_SMC |
|
library_name: transformers |
|
language: |
|
- ml |
|
tags: |
|
- mamba |
|
- ssm |
|
- s6 |
|
- jamba |
|
- llm |
|
- state space models |
|
- malayalam |
|
- indic |
|
--- |
|
|
|
![Thumbnail](thumbnail.jpg) |
|
|
|
# Ma-layala-mba |
|
|
|
Welcome to Ma-layala-mba, a base Indic language model designed to push the boundaries of NLP for Indian languages. It is based on the Mamba series of state space models. |
|
|
|
## Model Description |
|
|
|
Ma-layala-mba is a state-of-the-art S6 SSM model specifically crafted for the South Indian regional and state language of Kerala: Malayalam. It integrates traditional Attention mechanisms with innovative approaches such as MLPs and State Space Models (SSMs) to handle complex linguistic features and achieve high accuracy in language understanding and generation. |
|
|
|
- **Model Type**: A 128M Jamba model finetuned on ~1.5M samples of Malayalam prompt-response pairs from a subset of the IndicCorop Dataset |
|
- **Language(s)**: Malayalam |
|
- **License**: GNU General Public License v3.0 |
|
- **Training Precision**: bfloat16 |
|
|
|
## Example Usage |
|
|
|
Here's a quick example to get you started with the Ma-layala-mba model: |
|
|
|
```python |
|
from transformers import MaLayalaMbaForCausalLM, AutoTokenizer, pipeline |
|
|
|
model = MaLayalaMbaForCausalLM.from_pretrained( |
|
"aoxo/Ma-layala-mba_Tiny_128M", |
|
# load_in_8bit=True, # Set this depending on the GPU you have |
|
torch_dtype=torch.bfloat16, |
|
device_map={"": 0}, # Set this depending on the number of GPUs you have |
|
local_files_only=False # Optional |
|
) |
|
model.eval() |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("aoxo/Ma-layala-mba_Tiny_128M") |
|
|
|
input_ids = tokenizer("മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക", return_tensors='pt').to(model.device)["input_ids"] |
|
|
|
outputs = model.generate(input_ids, max_new_tokens=100) |
|
|
|
print(tokenizer.batch_decode(outputs)) |
|
``` |
|
|
|
### Example Output: |
|
|
|
``` |
|
മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക |
|
|
|
a. വലിയ - __________ |
|
b. രസം - __________ |
|
c. സുഖം - __________ |
|
d. പ്രകാശം - __________ |
|
e. വേഗം - __________ |
|
``` |
|
|
|
## Usage Note |
|
|
|
Please be aware that this model has not undergone comprehensive detoxification or censorship. While it exhibits strong linguistic capabilities, there is a possibility of generating content that may be deemed harmful or offensive. We advise users to apply discretion and closely monitor the model's outputs, especially in public or sensitive settings. |
|
|
|
## Meet the Developers |
|
|
|
- **Alosh Denny** |