4-bit GPTQ version?

#1 opened by nacs

Thanks for training this model. I'd love to try it, but was curious if you could upload a 4-bit quantized version?

(It would take up a lot less disk space and VRAM.) Thanks!


Yes, I'll do so right now.
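
For anyone curious what producing such a checkpoint involves, here is a minimal sketch of 4-bit GPTQ quantization using the AutoGPTQ library. The model path, calibration text, output directory, and quantization settings are all placeholders; the author's actual process isn't stated in this thread.

```python
# Minimal sketch of 4-bit GPTQ quantization with the auto-gptq library.
# Model path, calibration text, and output directory are illustrative
# placeholders; the author's actual settings are not stated in this thread.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

model_id = "path/to/merged-13b-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit weights, as requested above
    group_size=128,  # a common GPTQ grouping choice (assumption)
)

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# GPTQ needs a small set of calibration examples to measure activations;
# a real run would use many representative samples, not just one.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.",
                      return_tensors="pt")]
model.quantize(examples)

model.save_quantized("path/to/output-4bit")  # placeholder
```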

Would it be possible to also get a version that works with llama.cpp (https://github.com/ggerganov/llama.cpp)?


I'm sorry, I've never worked with it, so I'm not quite sure how to do that.


I'm uploading version 2 of the RP LoRA merged into the GPT4-x-Alpaca model, quantized to 4-bit, right now; you can get it here:
https://huggingface.co./teknium/GPT4-x-Alpaca13b-RolePlayLora-4bit-v2
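
If it helps anyone, loading a GPTQ checkpoint like that one can look roughly like the sketch below, again using AutoGPTQ. Depending on the repo's file layout, extra arguments such as `model_basename` or `use_safetensors` may be needed, and the prompt format shown is only an assumption.

```python
# Minimal sketch of loading a 4-bit GPTQ checkpoint with auto-gptq.
# Extra arguments (model_basename, use_safetensors) depend on how the
# repo's files are laid out; the prompt format is an assumption.
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

repo = "teknium/GPT4-x-Alpaca13b-RolePlayLora-4bit-v2"

tokenizer = AutoTokenizer.from_pretrained(repo, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(repo, device="cuda:0")

prompt = "### Instruction:\nIntroduce yourself.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```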


I can take a crack at it. It's for compatibility with llama.cpp.
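
For reference, llama.cpp compatibility generally means producing a ggml file from the HF weights and then quantizing it. Here is a rough two-step sketch, driven from Python and run inside a llama.cpp checkout; the script and binary names (`convert.py`, `./quantize`) vary between versions of the repo and should be double-checked against its README, and all paths are placeholders.

```python
# Rough sketch of converting HF weights for llama.cpp, driven from Python.
# Script/binary names (convert.py, ./quantize) come from the llama.cpp repo
# and may differ by version; all paths are placeholders.
import subprocess

model_dir = "models/GPT4-x-Alpaca13b-RolePlayLora"  # placeholder

# Step 1: convert the HF/PyTorch weights to an f16 ggml file.
subprocess.run(["python", "convert.py", model_dir], check=True)

# Step 2: quantize the f16 ggml file down to 4-bit (q4_0).
subprocess.run(
    ["./quantize",
     f"{model_dir}/ggml-model-f16.bin",
     f"{model_dir}/ggml-model-q4_0.bin",
     "q4_0"],
    check=True,
)
```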
