Leave your feedback here

#2
by Gryphe - opened
Owner

You can use this discussion to provide feedback. Thanks in advance!

First impressions of this version of the model:

  1. Less verbose than pure 1.6.
  2. Slightly worse adherence to additional instructions. Example: "the character's inner thoughts".
  3. Good at navigating space, but describes it much worse than 1.6.
  4. The character I tested the model on behaved more interestingly in 1.6 and was more emotionally expressive. In this version he's more boring, his motivations are hard to understand, and the lack of environmental description makes the action feel like it's taking place in a vacuum.

Disclaimer: I'm testing with the sampler settings you specified and the minimal basic SillyTavern ChatML instructions, to avoid unnecessary clutter and see a clean model run without additional hints.

Owner

Some excellent observations there, thank you so much.

I've been testing some more myself and I suspect that the increased diversity from the KTO training had an adverse effect on the model's understanding.

Yes, you're right, the model has comprehension issues. It sometimes mistakes my dialogue for the character's dialogue.

Yeah, as above. It tends to confuse things a bit more - subjects, concepts, etc.

We can't have that! I'll be certain to get to the bottom of this in the coming days to see what caused it. My main suspect is the added Opus writing prompts, so I might retry a KTO run without them.

I prioritize understanding over diversity, as nothing sucks as much as not having the model understand what you're trying to say.

EDIT: Do mention any improvements as well, if you find any. ;-)

Perhaps my review will help you. I tested four NeMo-based models from different authors.

  1. NeverSleep/Lumimaid-v0.2-12B-GGUF - seemed very chaotic and unstable to me. Although it is very popular, I personally did not like it.
  2. Sao10K/MN-12B-Lyra-v3 - this model is strange. It behaves like a person having a heart attack and a stroke at the same time; it feels like it has gone crazy and was trained on the texts of cheap stand-up comedians. It writes and soaps very beautifully, but all of that is buried in huge tons of garbage and nonsense. Communicating with it is like looking for diamonds in manure.
  3. Undi95/Lumimaid-Magnum-12B-GGUF - one of my favorites, VERY smart, logical and holds instructions well.
  4. And your model is, for me, approximately equal to the 3rd, but has a greater depth of thinking.
Owner

I'm cooking up another KTO train right now, minus the writing prompts. I now know what to look for during testing, thanks to the feedback from this thread.

If anyone would like to test an early release GGUF at some point, please let me know your preferred quant level - I'll put those in my quant repo once I have something workable.

I wonder if there's any way for the model to better learn and separate out different subjects and their attributes more accurately - I find that the models that can do that and maintain those differences are generally more intelligent too.

Edit: I find that most models get somewhat perplexed when characters adopt a different name or identity, or transform.

Try dabbling with Sao10K/MN-12B-Lyra-v1. In my opinion it's one of the best models out there right now, with incredible attention to detail and character personality on minimal settings. Unfortunately, version 3 was in an accident and the model just went crazy!
Also try MarinaraSpaghetti/NemoMix-Unleashed-12B-GGUF. It is less detailed but also very nice in communication and behavior. Both of these models behave consistently and don't attack you like March cats.

Personal opinion: if you combine Lyra v1, with its attention to detail and prose, and Pantheon, with its emotions and internal dialogue, you get a 2nd MythoMax, but 1000 times better. And if the model is blown up to at least 12*2, I'll be praying for that version of the LLM!
