Need a 7B Q2_K.gguf
Please make a Q2_K GGUF.
What's the use case for a Q2_K 7B model? Wouldn't performance degradation be extreme?
That's true, but sometimes it's not too bad and still produces decent answers.
Q2_K 7B is simply faster at local inference.
That's the main case. It's also smaller on disk and uses less RAM (rough numbers sketched below).
I've also found that Q2_K sometimes gives rawer, more random output instead of long, boring answers.
The last case is to compare for myself and see if it's better at all of that than the Herman one:
https://huggingface.co./TheBloke/dolphin-2.2.1-AshhLimaRP-Mistral-7B-GGUF
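For a rough sense of the size and RAM difference, here's a back-of-the-envelope estimate in Python. The bits-per-weight figures are approximations for llama.cpp's quant formats, not exact file sizes; real files come out somewhat larger because some tensors are kept at higher precision, plus metadata overhead:

```python
# Back-of-the-envelope GGUF size estimate: params * bits_per_weight / 8.
# The bpw values below are approximations for llama.cpp quant formats;
# actual files are larger since some tensors (e.g. embeddings/output)
# are kept at higher precision.

N_PARAMS = 7e9  # a "7B" model

APPROX_BPW = {
    "F16":    16.0,
    "Q8_0":   8.5,    # 8-bit weights + per-block fp16 scale
    "Q4_K_M": 4.85,   # rough effective average for the mixed quant
    "Q2_K":   2.625,  # 2-bit blocks + per-block scales/mins
}

for name, bpw in APPROX_BPW.items():
    gib = N_PARAMS * bpw / 8 / 2**30
    print(f"{name:7s} ~{gib:5.2f} GiB on disk (weights take a similar RAM footprint)")
```

So Q2_K lands somewhere around a third of the f16 footprint, which is the whole appeal on small machines.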
Uploading it now; it should be online soon. Since I have limited upload bandwidth and planned to update the repository on a regular basis at least for the next few weeks, I wanted to avoid having to make too many versions.
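In case anyone wants to roll their own in the meantime, here's a minimal sketch of the usual llama.cpp flow: convert the HF checkpoint to an unquantized f16 GGUF, then quantize that down to Q2_K. The paths are placeholders, and the script/binary names have shifted between llama.cpp versions (e.g. convert.py vs. convert_hf_to_gguf.py, quantize vs. llama-quantize), so adjust to your checkout:

```python
import subprocess

# Minimal sketch of the usual llama.cpp quantization flow.
# Assumes a local llama.cpp checkout with the tools built;
# exact script/binary names vary by version.

MODEL_DIR = "path/to/hf-model"   # placeholder: the HF model directory
F16_GGUF = "model-f16.gguf"
Q2K_GGUF = "model-Q2_K.gguf"

# 1) Convert the HF checkpoint to an unquantized (f16) GGUF.
subprocess.run(
    ["python", "convert.py", MODEL_DIR, "--outtype", "f16", "--outfile", F16_GGUF],
    check=True,
)

# 2) Quantize the f16 GGUF down to Q2_K.
subprocess.run(["./quantize", F16_GGUF, Q2K_GGUF, "Q2_K"], check=True)
```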
The current version of Limamono has been finetuned on the base Mistral-7B model, by the way. So, I expect it not to follow general instructions very well, unless they are "roleplayed" in the specified novel/book/forum RP format.