Regression vs original

#3
by MoonRide - opened

A regression I've noticed vs. the original Gemma during initial tests (the original model didn't fail). It happens once or twice per 10 attempts, like this:

[screenshots of the failing outputs]

Launched using `llama-server.exe -v -ngl 99 -m gemma-2-9b-it-Q6_K.gguf`, with the setup below:

[screenshot of the server setup]

https://huggingface.co./BeaverAI?search_models=tiger-gemma-9b-v2

v2a might have the same issue as v1, but the other versions should fare better.

@TheDrummer Okay, I will check those out. Btw, I just started playing with Big Tiger v1 (Big-Tiger-Gemma-27B-v1-IQ4_XS.gguf from https://huggingface.co./bartowski/Big-Tiger-Gemma-27B-v1-GGUF), and I see the same problem there (while the same quant of the original always gives the correct answer).

UPDATE: I tested Tiger-Gemma-9B-v2g-Q6_K.gguf from https://huggingface.co./BeaverAI/Tiger-Gemma-9B-v2g-GGUF, and it still sometimes fails. Also, v2g refuses much more, similar to the original version.

PS: I really like the idea of uncensored models staying as smart as the original, without messing them up with heavy finetuning, just giving them the ability to treat adults like adults. For L3, a pretty nice approach along those lines was https://huggingface.co./vicgalle/Configurable-Llama-3-8B-v0.3 from @vicgalle (a model that learns to follow a range of system prompts). Maybe something like that could work for the Gemma 2 series, too?

Is it even a surprise? I don't remember any model based on something like Llama 3, Gemma 2, etc. that is not worse than the original, at least at reasoning...


Not even a mixture of models? Have you tested Lunaris?
