File size: 2,665 Bytes
676dbde 38eaca8 676dbde e892cf1 676dbde 61895fb 676dbde 61895fb 676dbde 36f6eba 676dbde 38eaca8 676dbde 38eaca8 676dbde b1547c3 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 |
---
license: mit
datasets:
- VishnuPJ/Malayalam_CultureX_IndicCorp_SMC
library_name: transformers
language:
- ml
tags:
- mamba
- ssm
- s6
- jamba
- llm
- state space models
- malayalam
- indic
---
![Thumbnail](thumbnail.jpg)
# Ma-layala-mba
Welcome to Ma-layala-mba, a base Indic language model designed to push the boundaries of NLP for Indian languages. It is based on the Mamba series of state space models.
## Model Description
Ma-layala-mba is a state-of-the-art S6 SSM model specifically crafted for the South Indian regional and state language of Kerala: Malayalam. It integrates traditional Attention mechanisms with innovative approaches such as MLPs and State Space Models (SSMs) to handle complex linguistic features and achieve high accuracy in language understanding and generation.
- **Model Type**: A 128M Jamba model finetuned on ~1.5M samples of Malayalam prompt-response pairs from a subset of the IndicCorop Dataset
- **Language(s)**: Malayalam
- **License**: GNU General Public License v3.0
- **Training Precision**: bfloat16
## Example Usage
Here's a quick example to get you started with the Ma-layala-mba model:
```python
from transformers import MaLayalaMbaForCausalLM, AutoTokenizer, pipeline
model = MaLayalaMbaForCausalLM.from_pretrained(
"aoxo/Ma-layala-mba_Tiny_128M",
# load_in_8bit=True, # Set this depending on the GPU you have
torch_dtype=torch.bfloat16,
device_map={"": 0}, # Set this depending on the number of GPUs you have
local_files_only=False # Optional
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained("aoxo/Ma-layala-mba_Tiny_128M")
input_ids = tokenizer("മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക", return_tensors='pt').to(model.device)["input_ids"]
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.batch_decode(outputs))
```
### Example Output:
```
മലയാളം പര്യായപദങ്ങളിൽ ഒരു പരീക്ഷ പേപ്പർ ഉണ്ടാക്കുക
a. വലിയ - __________
b. രസം - __________
c. സുഖം - __________
d. പ്രകാശം - __________
e. വേഗം - __________
```
## Usage Note
Please be aware that this model has not undergone comprehensive detoxification or censorship. While it exhibits strong linguistic capabilities, there is a possibility of generating content that may be deemed harmful or offensive. We advise users to apply discretion and closely monitor the model's outputs, especially in public or sensitive settings.
## Meet the Developers
- **Alosh Denny** |