This looks interesting

#1
by Danioken - opened

I'm a fan of your previous modifications to this model. "Psyonic-Cetacean-Ultra-Quality-20b-GGUF-imat-plus2" is amazing. I am using the Q6_K version, and I think the "Psyonic-Cetacean-20B-Ultra-NEO-X-BETA-IQ4_XS-imat13.gguf" version can actually match it.

Are you planning to make a NEO IQ6 version? The effect could be really great.

There is also an interesting thing I noticed: your NEO models use a lot of memory for a while during initialization (a sudden peak) and then reduce it. It may be the fault of my configuration; I use "LM Studio rocm" and "koboldcpp-rocm". Ahh, these AMD and Radeons... Because of this peak, I cannot load certain models, even though they work in the "standard" version.

Owner

Thank you!
As for "IQ6": this would be a Q6. However, the "X" quants are a special breed ("BETA") and are still being developed.
REASON: The behavior of "X" quants varies by quant and by model.
In the case of PSYCET, the "BETA" is a specific type of "X" quant for the model (based on merge type and layer locations).

RE: memory usage.
This is a side effect of "IQ" quants.
Time permitting, more quants will be uploaded as they are tested and meet requirements.
