---
license: mit
tags:
- mamba
- pytorch
- Text Generation
- research abstract
datasets: pt-sk/research_papers_short
metrics: CrossEntropyLoss
---
This model uses the Mamba architecture and was trained on a dataset of research abstracts (pt-sk/research_papers_short).
* Optimizer: AdamW
* Learning Rate: 0.001
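
As a rough illustration of the settings above, here is a minimal sketch of one training step with AdamW at a learning rate of 0.001 and a cross-entropy loss. The tiny `toy_model`, the vocabulary size, the batch shape, and the next-token objective are all assumptions made for the example; this is not the training script used for this model.

```python
import torch
from torch import nn

# Hypothetical stand-in model (embedding + linear head), NOT the Mamba network from this repo.
vocab_size, d_model = 100, 16
toy_model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.Linear(d_model, vocab_size),
)

# Settings taken from the model card: AdamW, learning rate 0.001, cross-entropy loss.
optimizer = torch.optim.AdamW(toy_model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

# Random batch of token ids with shape (batch, seq_len); assumed next-token objective,
# so the targets are the inputs shifted by one position.
tokens = torch.randint(0, vocab_size, (4, 9))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = toy_model(inputs)  # shape: (4, 8, vocab_size)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))

# One optimization step.
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Swapping `toy_model` for the real network would keep the optimizer and loss lines unchanged, provided the model returns logits of shape `(batch, seq_len, vocab_size)`.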
Import the scripts from the code folder
```
from model import Mamba, ModelArgs
```
Loading Model
```
mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")
```
Loading Tokenizer
```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')