Added q5_1 quant

#1
by concedo - opened

Here's a q5_1 quantized version of the WavTokenizer. I had to make a few tweaks to quantize it correctly, as some tensors have incompatible shapes, but it works correctly with the current version of llama.cpp and is somewhat smaller. Feel free to try it.
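
For anyone wondering what "incompatible shapes" can mean here, below is a rough sketch, not the actual procedure used for this file. It assumes the usual ggml constraint that q5_1 packs weights in blocks of 32 along a tensor's row, so a row length that isn't a multiple of 32 can't be quantized and has to stay in a higher-precision type. The helper names and the example tensor names are hypothetical; the post doesn't say which tensors needed tweaks.

```python
# Sketch only: decides which tensors could be stored as q5_1, assuming the
# block-of-32 constraint. Tensor names used with it are made up.

Q5_1_BLOCK_SIZE = 32  # weights per q5_1 block
# One block: fp16 scale (2 bytes) + fp16 minimum (2 bytes)
# + 4 bytes holding the 5th bit of each weight + 16 bytes of 4-bit low parts.
Q5_1_BLOCK_BYTES = 2 + 2 + 4 + 16


def can_quantize_q5_1(row_length):
    """True if a row of this length splits evenly into q5_1 blocks."""
    return row_length > 0 and row_length % Q5_1_BLOCK_SIZE == 0


def q5_1_bytes(row_length, n_rows):
    """Storage in bytes for an (n_rows x row_length) tensor in q5_1."""
    if not can_quantize_q5_1(row_length):
        raise ValueError("row length %d is not a multiple of %d"
                         % (row_length, Q5_1_BLOCK_SIZE))
    return (row_length // Q5_1_BLOCK_SIZE) * Q5_1_BLOCK_BYTES * n_rows


def plan_types(shapes, fallback="f16"):
    """Map each tensor name to 'q5_1' or a fallback type.

    shapes: dict of name -> (row_length, n_rows).
    """
    return {
        name: "q5_1" if can_quantize_q5_1(row_length) else fallback
        for name, (row_length, _n_rows) in shapes.items()
    }


if __name__ == "__main__":
    # Hypothetical shapes, not taken from the real WavTokenizer file.
    example = {"example.dense.weight": (768, 768),
               "example.conv.weight": (7, 512)}
    for name, chosen in plan_types(example).items():
        print(name, "->", chosen)
```

Under that layout each block takes 24 bytes for 32 weights, i.e. 6 bits per weight versus 16 for f16, which would explain the smaller file.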

ggerganov changed pull request status to merged
