The versions beyond Q4 quantization are completely unavailable

#1
by yuiaa001 - opened

The Q4_K_M version performed well, but none of the versions higher than Q4 answered as expected.

I am running it through ollama 0.1.41.

Can you explain what "unavailable" means?

And what answer do you get, and what do you expect? As such, this posting is pretty useless.

Just tried out the Q6_K and it works fine. Make sure you downloaded the files correctly and fully, and maybe consult a support forum for ollama on how to set it up.
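One quick way to check that the files came down intact (a sketch, not an official procedure; the filename is a placeholder) is to hash the local GGUF and compare it against the SHA256 shown on the model page's file view:

```python
import hashlib

# Placeholder filename -- substitute the quant you actually downloaded.
path = "model-Q6_K.gguf"

h = hashlib.sha256()
with open(path, "rb") as f:
    # Hash in 1 MiB chunks so multi-GB GGUF files don't need to fit in RAM.
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        h.update(chunk)

# Compare this digest with the SHA256 listed for the file on Hugging Face.
print(h.hexdigest())
```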

mradermacher changed discussion status to closed

They're just throwing out random outputs; only the Q4 version can answer the question correctly.

Are you using this model through ollama?

No, I use llama.cpp, which ollama also uses (but likely an older version).
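If you want to rule out ollama's bundled llama.cpp, one option (a sketch assuming the llama-cpp-python bindings; the model path is a placeholder) is to load the same GGUF directly and use the chat API, which applies the template embedded in the file:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder path -- point this at the quant that misbehaves under ollama.
llm = Llama(model_path="model-Q6_K.gguf", n_ctx=2048)

# create_chat_completion() applies the chat template stored in the GGUF
# metadata, so a bad or missing template in ollama's config is ruled out too.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```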

The older llama.cpp version might be the cause of this issue. Even with the f16 model, I still get irrelevant answers.

It's probably a configuration/usage issue, though.
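To make "configuration/usage issue" concrete (this is a guess at one common cause, not a diagnosis of this exact case): sending a bare prompt to an instruction-tuned model bypasses its chat template, which often produces exactly this kind of off-topic output. Continuing the hypothetical llama-cpp-python sketch above:

```python
# A bare completion call skips the chat template entirely; instruction-tuned
# models frequently ramble or ignore the question when prompted this way.
raw = llm("What is the capital of France?", max_tokens=64)
print(raw["choices"][0]["text"])

# If this output is irrelevant but create_chat_completion() above answers
# correctly, the problem is the prompt format, not the quantization.
```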

[screenshot: the f16 model's output]

It seems unable to understand my question.
