---
license: mit
tags:
- mamba
- pytorch
- Text Generation
- research abstract
datasets: pt-sk/research_papers_short
metrics: CrossEntropyLoss
---
This model uses the Mamba architecture and was trained on a dataset of research abstracts (`pt-sk/research_papers_short`).

* Optimizer: AdamW
* Learning Rate: 0.001

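The training settings above could be wired up roughly as follows. This is a minimal sketch, not the original training script: the `nn.Linear` stand-in model, the tensor shapes, and the use of PyTorch defaults for every AdamW hyperparameter other than the learning rate are assumptions made for illustration.

```python
import torch
from torch import nn

# Stand-in module for illustration only; in practice this would be the Mamba model.
model = nn.Linear(8, 8)

# AdamW with the learning rate listed above. Betas, eps and weight decay are
# left at PyTorch defaults (an assumption; the card does not state them).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Cross-entropy loss, matching the metric named in the card metadata.
loss_fn = nn.CrossEntropyLoss()

# One illustrative optimization step on random data.
inputs = torch.randn(4, 8)            # (batch, features)
targets = torch.randint(0, 8, (4,))   # (batch,) class indices
loss = loss_fn(model(inputs), targets)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```
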


Import the scripts from the code folder
```
from model import Mamba, ModelArgs
```


Loading Model
```
# Move the model to the GPU; use "cpu" instead if CUDA is not available
mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")
```


Loading Tokenizer
```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')
```

The mamba_reserach file contains the state dicts of the optimizer and the model.
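
A checkpoint that bundles both state dicts is typically saved and restored along the following lines. This is a hedged sketch: the dictionary keys `"model"` and `"optimizer"`, the temporary file path, and the `nn.Linear` stand-in are hypothetical, and the actual layout of the checkpoint file may differ, so inspect its keys before relying on this pattern.

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in model and optimizer for illustration; not the actual Mamba model.
model = nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Hypothetical key names; the real checkpoint may use different ones.
checkpoint = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}

# Save to a temporary location so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save(checkpoint, path)

# Restore into freshly constructed objects of the same shape.
restored_model = nn.Linear(4, 4)
restored_optimizer = torch.optim.AdamW(restored_model.parameters(), lr=1e-3)

loaded = torch.load(path, map_location="cpu")
restored_model.load_state_dict(loaded["model"])
restored_optimizer.load_state_dict(loaded["optimizer"])
```

Restoring the optimizer state alongside the weights matters mainly when resuming training; for inference only the model state dict is needed.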