Model config.json has Mistral params instead of Mixtral, breaking ExLlama quants and maybe affecting others too

#3
by TheBloke - opened

I got reports that ExLlamav2 wasn't working with this GPTQ. It turns out that's because it's trying to load it as a Mistral model, which is due to the architecture in config.json being set to Mistral instead of Mixtral.

Also, `rope_theta` should be 1000000.0 for Mixtral; this can affect inference quality.
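For reference, here's a minimal sketch of how the affected fields in config.json should read after the fix, assuming the standard Transformers class name for Mixtral (all other fields omitted, and the exact values for this particular merge may differ):

```json
{
  "architectures": [
    "MixtralForCausalLM"
  ],
  "model_type": "mixtral",
  "rope_theta": 1000000.0
}
```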

I don't think any of this would stop k-quants from working, though, so that issue might be unrelated. I'll try making some anyway.

Undi95 changed pull request status to merged