Does this work for anyone?
15K downloads, but even with the PR branch all this model does is error out for me, whether I use one of these quants or quant it myself.
As written in the readme, the model was generated with a branch of llama.cpp that was intended to add support for GLM4/GLM3.
Support was merged into master 3 days ago: https://github.com/ggerganov/llama.cpp/pull/8031
If the model still doesn't work after you've updated llama.cpp, ping me and I'll do a re-quant.
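If you build llama.cpp from source, updating past the merge and re-converting looks roughly like this. This is a sketch, not official instructions: the model directory is a placeholder, and the exact names of the convert script and quantize binary depend on your llama.cpp version (they were renamed around mid-2024).

```shell
# Get a llama.cpp build that includes the merged GLM support (PR #8031)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make -j        # or: cmake -B build && cmake --build build

# Re-convert the original HF weights to GGUF
# ('path/to/glm-4-9b-chat' is a placeholder for your local model directory)
pip install -r requirements.txt
python convert_hf_to_gguf.py path/to/glm-4-9b-chat --outfile glm-4-9b-chat-f16.gguf

# Then quantize as usual (binary may be named 'quantize' on older builds)
./llama-quantize glm-4-9b-chat-f16.gguf glm-4-9b-chat-Q4_K_M.gguf Q4_K_M
```

The important part is re-converting from the original HF weights with the updated script — GGUFs produced by the pre-merge branch keep the old tensor layout and will still fail to load.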
The PR suggests the 1M version of the model isn't even supported yet, lol.
https://github.com/ggerganov/llama.cpp/pull/8031#issuecomment-2213635819
It doesn't work when I quantize it myself either.
```
llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_qkv.weight' has wrong shape; expected 4096, 4608, got 4096, 5120, 1, 1
llama_load_model_from_file: failed to load model
```
I tried Q4, Q6, Q8 and F16 on Ollama, llama.cpp and KoboldCpp — always the same error message.
Ok, thanks for the heads-up! I'll add a notice to the readme and do a re-quant once support is implemented.