I'm unable to make these models work
#8 by hdnh2006 · opened
Hello,
I have tried both the 7B and 2B models and I am unable to get them to work correctly. I tried your official GPTQ quantization, and I also quantized a GGUF version myself, but the results are poor in both cases.
With your GPTQ quantization
With my GGUF quantization
I have made a YouTube video (in Spanish) about it in case you want to check it: https://youtu.be/CPkvcREEMc8
Without quantization
I have an RTX 4060 with 16GB of VRAM, so I tried the 2B model at full precision. I don't have enough hardware to run the 7B model in full precision, but the results are still the same.
Is there any chance you could provide the benchmarks?
Thanks in advance.