TypeError: forward() got an unexpected keyword argument 'num_logits_to_keep'
#51 opened 6 months ago by shajiu

Adding Evaluation Results
#50 opened 6 months ago by leaderboard-pr-bot

AttributeError: 'HybridMambaAttentionDynamicCache' object has no attribute '_modules'
7
#48 opened 7 months ago by xxrjun

Adding Evaluation Results
#47 opened 8 months ago by leaderboard-pr-bot

ai21 instance not runnable with langchain
1
#45 opened 9 months ago by LordSahu

Is there any SFT or Chat model?
2
#41 opened 11 months ago by chuyi777

How to use accelerate evaluate Jamba
#40 opened 11 months ago by Xidong

Jamba Evaluation Task on GSM8K
#39 opened 11 months ago by ssparks

Do you have plans to release papers on Jamba's architecture or miniature models?
#38 opened 11 months ago by badrabbitt

Are there any weight files for pre-trained models?
#37 opened 11 months ago by aidenxy

Memory usage on single A100*80GB in training
#36 opened 11 months ago by DavidWu1116

Fast Mamba
5
#34 opened 11 months ago by Praneethkeerthi

Why does throughput increase with longer context window?
3
#33 opened 11 months ago by jingyu-q

Request: DOI
#32 opened 11 months ago by kozolex

GGUF quants?
1
#31 opened 11 months ago by 6346y9uey

Any release plans for the 7b jamba model without MoE?
2
#30 opened 11 months ago by danielpark

Why is there an MLP in the Mamba Layer?
#28 opened 11 months ago by naston

Complex vs Real parametrization.
#27 opened 11 months ago by Yutida

How to Fine-tune Jamba on google Colab?
7
#26 opened 11 months ago by Ateeqq

Layer-Selective Rank Reduction
#25 opened 11 months ago by mizinovmv

Update README.md
#23 opened 11 months ago by rombodawg

Would there a chance Jamba to be train in 1.58bit weight?
1
#22 opened 11 months ago by shing3232

Anyone else currently experimenting with fine-tuning Jamba?
3
#21 opened 11 months ago by Severian

IndentationError: unindent does not match any outer indentation level
#19 opened 12 months ago by thebeline

ModuleNotFoundError: No module named 'transformers_modules.ai21labs.Jamba-v0'
5
#17 opened 12 months ago by hjewr

Fast Mamba kernels are not available
10
#16 opened 12 months ago by MohamedRashad

does all safe tensors needed to be downloaded to use this model on colab?
2
#14 opened 12 months ago by Kv-boii

How many pretraining tokens?
#13 opened 12 months ago by CyberNative

Smaller version to ease implementation experiments?
7
#12 opened 12 months ago by compilade

Coding performance of base model?
4
#11 opened 12 months ago by rombodawg

Can you give a short explanation about the benefits and the architecture?
2
#7 opened 12 months ago by SicariusSicariiStuff

A Bang Up Job
2
#4 opened 12 months ago by nightvision04

multiple gpu?
3
#3 opened 12 months ago by bdambrosio

Just a solid congrats and thank you to your team
1
#1 opened 12 months ago by Severian
