Wow, another step in the right direction.
16bit Lyra has been my favorite of yours for a while, and 16bit mag-well was quickly taking that seat, but this Captain-Eris_violet is a strong competitor.
Other models run into issues: they have no problem doing weird shit, but it's like the LLM thinks it's appalling if I tell the {{char}} to kill the bad guy. A {{char}} who literally could be a yandere serial killer will suddenly grow a conscience and lecture me about how killing is wrong and goblins are just misunderstood. Not with Captain-Eris_violet, {{char}} goes full berserk mode. I endorse!
Yeah! Nitral really cooked with this one. I'm also pretty happy with it and will spend more time with this model for personal use.
Unless you really need the BF16/full weights, I am sure the Q8 is pretty much indistinguishable in quality from the full size.
Thanks for the feedback!
Glad to see you guys enjoy it. I'll be dropping an inverse version of this at some point, using Twilight as the base instead but in the same configuration. After that I will probably be dropping some variants for testing, before I do a pixtral tune using the overall feedback from bmo, these and hathor. Also had to say hi to the boi lewd since it's been a minute, so on that note - Henlo.
Happy to see ya, been a bit busy with life stuff lately but still kicking around.
That's what people say about Q8, but for me, there is a very noticeable difference in responses. Maybe on paper it's not really that great of a difference, but in play, I feel like those subtle statistical variations make quite the difference in long form. Multiple small output inaccuracies from Q8 add up. For example, in Q8, I find characters degrade into generic AI bots significantly quicker, whereas in float 16, bots tend to adhere to their character archetype longer and are more true to their lore. Story developments seem to be more accurate as well, especially around response 50+. Float 16 story developments make more sense, whereas Q8 is more like the Kool-Aid Man busting through the wall. Maybe it's just placebo for me. shrug
I would be very inclined to say it could be placebo, but here's what I'll do: I'll continue to upload the BF16/FP16 full weights, since there could be demand for them from those who want to evaluate the models at that quality. I will see that the full quants are uploaded for these too later.
And I also go through the trouble of using the FP16 weights for imatrix even though it's unnecessary and the Q8 would be more than enough, just because I can't shake that feeling from the back of my head, so I also understand you not wanting to potentially leave any quality behind, even if only theoretical. I feel you.
❤️
Definitely the best lower-B model I've seen. Jumped to the top of my list for the 12B range.