Working with llama.cpp?
#1 by ivanpzk - opened
Thank you! Is it working with llama.cpp? I don't have time to try it right now.
Yes, they’re made with llama.cpp.
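For anyone who wants a quick way to try them, here's a minimal sketch using the llama-cpp-python bindings; the model filename and parameters below are placeholders, not from this repo:

```python
# Minimal sketch with llama-cpp-python (pip install llama-cpp-python).
# model_path is a placeholder; point it at the quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="model.Q4_K_M.gguf",  # placeholder filename
    n_ctx=4096,                      # context window; raise for the 128K variant
)

out = llm("Explain GGUF in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```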
The Q4_K_M quant is not loading in koboldcpp or text-generation-webui, no idea why. Everything is up to date.
As far as I can tell, the 128K version is not supported by llama.cpp yet. See: https://github.com/ggerganov/llama.cpp/issues/6849#issuecomment-2074899603
Perhaps the quantization tool already supports producing the GGUF files, but you cannot run inference on this model yet. Please correct me if I'm wrong.
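If it helps, here's a hedged sketch for checking whether your build can actually load the file (again via the llama-cpp-python bindings, with a placeholder filename); an unsupported architecture usually fails at load time rather than during generation:

```python
# Quick load check: unsupported model architectures typically fail here.
from llama_cpp import Llama

try:
    llm = Llama(model_path="model-128k.Q4_K_M.gguf")  # placeholder filename
    print("Model loaded; inference should work.")
except Exception as err:
    print(f"Load failed (possibly unsupported by this build): {err}")
```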
The quants have been updated with the latest llama.cpp release.
munish0838 changed discussion status to closed