Need GGUF support

#2
by huntz47 - opened

Is anyone from the Apple team seeing this? Please add a GGUF format for this model.
Thank you.

I want that too

Hey hey @huntz47 & @sdyy - sorry for the delay in response. OpenELM is supported in llama.cpp!

I created some quants for the instruct models here:
450M - https://huggingface.co./reach-vb/OpenELM-450M-Instruct-Q8_0-GGUF
1.1B - https://huggingface.co./reach-vb/OpenELM-1_1B-Instruct-Q8_0-GGUF
3B - https://huggingface.co./reach-vb/OpenELM-3B-Instruct-Q8_0-GGUF

Note: I found quite a bit of quality degradation below Q8, but if you want to create other quants, feel free to use the GGUF-my-repo space for it: https://huggingface.co./spaces/ggml-org/gguf-my-repo

The inference instructions are in the model cards. Enjoy, and do let me know if you have any questions!
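
For a quick start, here is a minimal sketch of running one of these quants with llama-cpp-python. The filename glob and generation parameters are assumptions; the model card has the canonical instructions and the exact GGUF filename.

```python
# Minimal sketch: run the 450M Q8_0 quant via llama-cpp-python.
# Assumes: pip install llama-cpp-python huggingface-hub
# The filename glob below is an assumption; check the repo's file list.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="reach-vb/OpenELM-450M-Instruct-Q8_0-GGUF",
    filename="*q8_0.gguf",  # matches the single Q8_0 file in the repo
    n_ctx=2048,             # context window; adjust as needed
)

out = llm(
    "Explain what a GGUF file is in one sentence.",
    max_tokens=64,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```

The same pattern works for the 1.1B and 3B repos by swapping the `repo_id`.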
