Working with llama.cpp?

#1
by ivanpzk - opened

Thank you! Is it working with llama.cpp? I don't have time to try it right now.

Quant Factory org

Yes, they’re made with llama.cpp.

Q4_K_M is not loading in koboldcpp or text-generation-webui. No idea why; everything is up to date.

As far as I can tell, the 128K version is not supported by llama.cpp yet. See: https://github.com/ggerganov/llama.cpp/issues/6849#issuecomment-2074899603

Perhaps the quantization tool can already produce the GGUF files, but you cannot run inference on this model yet. Please correct me if I'm wrong.

Quant Factory org

Quants have been updated with the latest release of llama.cpp.

munish0838 changed discussion status to closed
