Problem Model

#1
by ClaudioItaly - opened

This model is broken. It starts generating endless characters. I've tried every way; I give up.
(screenshot attached: 2024-07-02_225537.png)

It's not broken for me; you need the latest llama.cpp.

What he said ^

And if you're still having issues, make sure flash attention is off
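For llama.cpp users, a minimal sketch of the advice above (the model filename is hypothetical; adjust for your quant). In llama.cpp, flash attention is opt-in via the `-fa`/`--flash-attn` flag, so the key is simply not to pass it; in LM Studio it's a toggle in the model settings:

```shell
# Update to a recent llama.cpp build (hypothetical local checkout).
git pull && cmake -B build && cmake --build build --config Release

# Flash attention is opt-in (-fa / --flash-attn); omit the flag to keep it off.
./build/bin/llama-cli -m gemma-2-9b-it-sppo-iter3-Q8_0.gguf -p "Hello" -n 64
```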

LM Studio updated llama.cpp yesterday, and the problem remains even without flash attention.

Same here. Broken GGUFs.

I'm not seeing any issues either on LM Studio 0.2.27. Can you guys share your hardware and settings?

Tested on Windows with a 3070 and Linux with a 3090.

No issue here, and I even tried it in French for a full 8K-token chat.
Windows 10 with a 3090 Ti, llama.cpp + SillyTavern as a front end; also tried with koboldcpp (Q8 quant).

I tried this https://huggingface.co./legraphista/Gemma-2-9B-It-SPPO-Iter3-IMat-GGUF and it works fine, so I don't know what is wrong with these GGUFs. I'm using the latest koboldcpp, and for me these don't even load the model; errors occur.

That's very interesting, since that quant from @legraphista (tagged so you can consider updating) was made with a version of llama.cpp that has a broken Gemma 2 implementation, so your experience should only be better with mine. What's the error you get, @AndrewLockhart?

Thanks for the tag @bartowski . I'll hold off on updating until we understand why my broken version works in @AndrewLockhart 's setup.

Yeah, good call; we need more details.

I re-downloaded one GGUF from this repo and it works fine; maybe I just had an old version of the file. So everything is fine now.

Ah good good that is the best case scenario haha

Thanks for letting us know! I'll be re-processing my repo to apply the fixes
