DeciMamba Checkpoint (Baseline)

The official Mamba-130m checkpoint, fine-tuned for language modeling on the PG-19 dataset, as presented in DeciMamba: Exploring the Length Extrapolation Potential of Mamba.

See our GitHub repo for evaluation and training scripts.
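
Language-modeling checkpoints such as this one are commonly scored by perplexity over held-out text. The snippet below is a minimal sketch of that computation, not code from the repository's scripts: the function name `perplexity` and its input format (a list of per-token negative log-likelihoods in nats) are assumptions made for illustration.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Return exp(mean NLL) for per-token negative log-likelihoods in nats.

    Hypothetical helper for illustration; the official evaluation scripts
    may aggregate losses differently.
    """
    if not token_nlls:
        raise ValueError("need at least one token to compute perplexity")
    mean_nll = sum(token_nlls) / len(token_nlls)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that assigns probability 1/4 to every token has perplexity 4.
    uniform_nlls = [math.log(4.0)] * 3
    print(perplexity(uniform_nlls))
```

In practice the per-token losses would come from the model's cross-entropy over a PG-19 document at the chosen context length; the averaging step stays the same.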

BibTeX:

@misc{benkish2024decimambaexploringlengthextrapolation,
      title={DeciMamba: Exploring the Length Extrapolation Potential of Mamba}, 
      author={Assaf Ben-Kish and Itamar Zimerman and Shady Abu-Hussein and Nadav Cohen and Amir Globerson and Lior Wolf and Raja Giryes},
      year={2024},
      eprint={2406.14528},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.14528}, 
}