Modified llama.cpp to generate GGUFs for Llama-3_1-Nemotron-51 (#22, opened 21 days ago by ymcki)
Documentation about the linear attention used in some layers of this model? (#21, opened 27 days ago by ymcki)
Comparison to the 70B model? (#20, opened about 1 month ago by AIGUYCONTENT, 1 reply)
Update README.md (#11, opened 3 months ago by Vlad748283847)
vLLM compatible? (#10, opened 3 months ago by nickandbro, 3 replies)
AttributeError: 'DeciLMConfig' (#9, opened 3 months ago by bluenevus, 2 replies)
fp8 / int8 inference - use bitsandbytes or awq (#8, opened 3 months ago by dtanow; see the quantized-loading sketch after this list)
GGUF possible? (#5, opened 3 months ago by gopi87, 2 replies)
fine-tuning (#1, opened 3 months ago by kzmaker)
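
Thread #8 above asks about fp8/int8 inference through bitsandbytes or AWQ. Purely as a point of reference, here is a minimal sketch of the bitsandbytes 8-bit route via transformers. The repository id (nvidia/Llama-3_1-Nemotron-51B-Instruct) and the need for trust_remote_code with the custom DeciLM-derived configuration are assumptions, and whether 8-bit quantization behaves correctly with this architecture is exactly what the thread is asking.

```python
# Sketch only: assumes the model loads through transformers with
# trust_remote_code, and that bitsandbytes LLM.int8() quantization works
# for this DeciLM-derived architecture (unverified, see thread #8).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "nvidia/Llama-3_1-Nemotron-51B-Instruct"  # assumed repo id

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # LLM.int8() via bitsandbytes

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",           # spread layers across available GPUs
    torch_dtype=torch.bfloat16,  # dtype for the non-quantized modules
    trust_remote_code=True,      # custom DeciLM config/model code
)

prompt = "Summarize the difference between int8 and fp8 inference."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The AWQ and vLLM paths raised in threads #8 and #10 would look different; this sketch only covers the bitsandbytes option named in the title.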