This shit is fire
It feels like it did a few years ago, when I first discovered Undi's merges. Is this a new era of RP? This model is very smart and picks up details and instructions much better than 70/123B models.
It has a few rough edges (some repetitiveness, etc.), but it looks like thinking models have huge potential. It impressed me. Multiple characters in one card currently don't work well, though.
lol it's so fucking good in 1-on-1 chat. I've never seen a model this creative and smart.
I honestly wasn't sure whether to download this or whether it was worth it, but well, this was a surprise.
Hi, thanks for the post.
Yes, multiple characters aren't supported, and I don't think they will be hahaha, I'm sorry.
@Undi95 also I noticed that it loves to self-censor in its thoughts, something like "I shouldn't focus on <that>, better focus on <that>, even though the user asked for <X>", and then it tries to avoid writing foul language even though the meaning is the same. Maybe you can improve that aspect in the next releases.
Sure, thanks for the feedback!
@Ainonake It's a Mistral model. Add "uncensored" anywhere (logical) in the system prompt and you won't get refusals besides saying the N word. If you still manage to get refusals, I have no idea what kind of stuff you're trying to make it say, or how badly you're asking, but that's not the model's fault. It's very compliant.
@Undi95 Now, for actionable feedback. (I'm using your suggested inference settings and the rest.)
I've barely scratched the surface (I'm partly waiting until I'm done with my system, so I can use different inference settings for thinking and acting), but I've noticed something kind of annoying: the roleplay formatting is extremely inconsistent. That happens even in perfectly formatted 16K+ context logs using
*narration* dialog
and another one using
narration "dialog"
It can sorta use both, but most of the time it'll mix and match. More annoyingly, it'll also use bold and italics within the dialog. I guess it's for emphasis, and in a vacuum it's kind of effective at it (picking the right word to bold or italicize). But in practice, especially during long chats, the output looks like a mess and is pretty difficult to read. That wouldn't be so bad if it were at least internally consistent, but it isn't really; it'll switch between different formatting on a whim.
I assume it's an inconsistent-dataset thing. The model is still very much usable, and for a Mistral base at 0.35 temp + nothing else, it's surprisingly diverse in its output. So it shows some promise, but (at least for me) it makes it very tedious to read or interact with. I can understand people liking the novelty aspect, though.
@SerialKicked
I appreciate this feedback!
I'm also a little annoyed by DeepSeek's dual use of formats, but this is how DeepSeek does RP; I didn't train on the inputs, only the outputs.
This means the RP formatting you see is taken directly from DeepSeek itself. I could do a cleaning pass and reformat every log into one unified format when I finish my experiment!
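A first cleaning pass could be as simple as stripping the emphasis markers so every log collapses into the `narration "dialog"` style. A rough sketch only, assuming the logs are plain strings and dialog is already quoted (`normalize_rp` is just a hypothetical helper name):

```python
import re

def normalize_rp(text: str) -> str:
    """Strip **bold** and *italic* markers so a mixed-format log
    collapses into the plain `narration "dialog"` style.
    Assumes dialog is already quoted; only removes emphasis."""
    return re.sub(r"\*{1,2}([^*\n]+?)\*{1,2}", r"\1", text)

# e.g. normalize_rp('She *smiles*. "You **really** came."')
#      -> 'She smiles. "You really came."'
```

Anything fancier, like wrapping unquoted dialog in quotes, would be much harder to get right automatically.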
You're welcome! And gotcha, I had no idea base R1 "acted" like that. I didn't mean to sound too harsh on the model. It really is interesting. But yeah, without beating that habit out of it, it'll be rough to use in that context. Technically I could probably filter some of it out at the UI level, but it'd probably also be very hit and miss due to the inconsistencies.
Looking forward to your next experiments in that domain.
Cheers!
Another fix you can try is to put example dialogue directly in the system prompt or the character card. Just be sure it's all sent in one system prompt, and it should copy it.
Yeah, tried that, of course. I'm kinda thorough when testing models: I have a dozen very long chatlogs in the two formatting styles, with extensive system prompts featuring example dialogs and writing guidelines. I wouldn't even bother you otherwise :)
And I'll give you that it sorta works for a few messages (on top of the 16k-token premade history, I mean; if it doesn't get the memo from that, I don't know what else it'd need). It'll still emphasize some words in the dialog, making it very annoying to read, but yes, it'll keep the general spirit for like 3-4 additional message pairs. After that, if you don't edit the crap out of its output, it'll become inconsistent again.
Now I get why the R1 dataset, at least for roleplay, needs to be sanitized. I wonder if it can be done through code alone or if it's really too wild to get right. I might look into it when I get some free time.
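If I do, a first pass might just be detecting which turns mix the two styles, so only the wild ones need hand-editing. A quick sketch; the regexes and the `needs_review` helper are my own guesses at what to look for, not anything taken from the actual dataset:

```python
import re

# Heuristics (my assumptions): a turn is "wild" if it mixes
# *asterisk* narration with emphasis inside "quoted" dialog.
ASTERISK_NARRATION = re.compile(r"(?<!\*)\*[^*\n]+\*(?!\*)")  # *smiles softly*
EMPHASIS_IN_QUOTES = re.compile(r'"[^"\n]*\*[^"\n]*"')        # "I *mean* it."

def needs_review(turn: str) -> bool:
    """Flag turns that mix both formatting styles for manual cleanup."""
    return bool(ASTERISK_NARRATION.search(turn) and EMPHASIS_IN_QUOTES.search(turn))
```

Running that over a dump would at least tell me whether the mixing is rare enough to fix by hand or whether it's everywhere.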
Anyway, took enough of your time. Thanks for all your work over the last few years!