metadata
license: mit
tags:
  - mamba
  - pytorch
  - Text Generation
  - research abstract
datasets: pt-sk/research_papers_short
metrics: CrossEntropyLoss

This model uses the Mamba architecture and was trained on a dataset of research abstracts (pt-sk/research_papers_short).

  • Optimizer: AdamW
  • Learning Rate: 0.001
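
The settings above map onto PyTorch roughly as follows. This is a minimal sketch, not the author's training script: the `TinyLM` stand-in model, the vocabulary size, and the batch shape are hypothetical. Only AdamW, the 0.001 learning rate, and the cross-entropy loss come from this card.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB_SIZE = 100  # hypothetical; the real value depends on the tokenizer


class TinyLM(nn.Module):
    """Hypothetical stand-in for the Mamba model: token ids -> logits."""

    def __init__(self, vocab_size: int, d_model: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(input_ids))  # (batch, seq, vocab)


model = TinyLM(VOCAB_SIZE)
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001)  # values from this card
loss_fn = nn.CrossEntropyLoss()


def train_step(batch: torch.Tensor) -> float:
    """One next-token-prediction step on a (batch, seq) tensor of token ids."""
    inputs, targets = batch[:, :-1], batch[:, 1:]
    logits = model(inputs)
    # CrossEntropyLoss expects (N, vocab) logits and (N,) integer targets.
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


batch = torch.randint(0, VOCAB_SIZE, (4, 16))  # random ids, for illustration only
losses = [train_step(batch) for _ in range(5)]
```

Shifting the batch by one position makes each token the prediction target for the tokens before it, which is the usual setup for text generation models.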

Import the scripts from the code folder

from model import Mamba, ModelArgs

Loading Model

mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")

Loading Tokenizer

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')
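
Once the model and tokenizer are loaded, text can be generated one token at a time. The sketch below shows greedy decoding under one assumption that this card does not confirm: that calling the model on a tensor of token ids returns logits shaped (batch, seq_len, vocab_size). `greedy_generate` and `CountingModel` are hypothetical names used only for illustration; `CountingModel` stands in for the real model so the example runs without downloading weights.

```python
import torch


@torch.no_grad()
def greedy_generate(model, input_ids: torch.Tensor, max_new_tokens: int = 20) -> torch.Tensor:
    """Append max_new_tokens greedily chosen tokens to input_ids.

    Assumes model(input_ids) returns logits shaped (batch, seq_len, vocab_size).
    """
    for _ in range(max_new_tokens):
        logits = model(input_ids)
        # Pick the highest-scoring token at the last position.
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # (batch, 1)
        input_ids = torch.cat([input_ids, next_id], dim=1)
    return input_ids


class CountingModel(torch.nn.Module):
    """Hypothetical stand-in: always predicts (last token + 1) % vocab_size."""

    def __init__(self, vocab_size: int = 10):
        super().__init__()
        self.vocab_size = vocab_size

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        next_ids = (input_ids + 1) % self.vocab_size
        return torch.nn.functional.one_hot(next_ids, self.vocab_size).float()


prompt = torch.tensor([[3, 4]])
output = greedy_generate(CountingModel(), prompt, max_new_tokens=3)
print(output.tolist())  # [[3, 4, 5, 6, 7]]
```

With the real model, the prompt would come from something like `tokenizer("...", return_tensors="pt").input_ids.to("cuda")`, and the result could be turned back into text with `tokenizer.decode(output[0])`. Greedy decoding is the simplest choice; sampling with a temperature is a common alternative when more varied output is wanted.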