https://huggingface.co./mistralai/Mistral-Nemo-Instruct-2407

#148
by WOOSAH - opened

Please add GGUF for this.
Would also love to see imatrix quants too <3 <3 <3

How I love gated models. Anyway, it's queued and should be done in a few hours or so. Cheers!

mradermacher changed discussion status to closed

Thank you thank you thank you, much love!

Unfortunately, it turns out to not be supported by llama.cpp at this time (due to lack of pretokenizer support). Sorry :/

This should now be supported in latest llama.cpp (b3542). I ran convert_hf_to_gguf.py and performed Q5_K_M quantization myself and everything worked perfectly fine.
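For anyone wanting to reproduce this locally, the two steps above roughly look like the following. This is a sketch only: the model directory and output filenames are assumptions, and it presumes you have llama.cpp b3542 or newer checked out and built.

```shell
# Sketch of the steps described above; paths/filenames are assumptions.
# 1. Convert the HF checkpoint to an F16 GGUF using the script that ships
#    with llama.cpp:
python convert_hf_to_gguf.py ./Mistral-Nemo-Instruct-2407 \
    --outfile mistral-nemo-instruct-2407-f16.gguf --outtype f16

# 2. Quantize the F16 GGUF down to Q5_K_M with the llama.cpp quantize tool:
./llama-quantize mistral-nemo-instruct-2407-f16.gguf \
    mistral-nemo-instruct-2407-Q5_K_M.gguf Q5_K_M
```

Since the thread mentions the pretokenizer issue, note that conversion only succeeds once llama.cpp knows the model's pretokenizer; older builds abort at step 1.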

Let's give it another try then :)
