Tips for better writing with long (>20K) context?
I noticed that when the LLM reaches around 20K tokens of context, it starts to mix up characters. For example, I have a narrator bot that controls every character in the story, and in scenes where the user isn't present, like two characters talking to each other, they suddenly start referring to the user. Instead of "Character places a gentle hand on OtherCharacter's shoulder." it says "Character places a gentle hand on YOUR shoulder" and starts acting like the user is present in the scene, talking to him and describing the actions to him. I'm running a Q8_0 GGUF with koboldcpp and SillyTavern. Here's my generation preset, just in case.
It rarely happens when the context is small, but once it hits 20K or more it becomes unbearable, requiring constant swipes and edits.
Edit: just tried the default settings from the model card - still happens.
Edit 2: and sometimes characters start acting like other characters, saying lines that clearly belong to someone else.
That sounds about right! The general consensus is that Nemo starts to destabilize around ~16k tokens.
I wish I could offer a magical solution, but this is mainly a limitation imposed by the model's base training. You can extend it (only somewhat) by fine-tuning on longer examples, but I sadly don't have any of those, especially for roleplay.
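In the meantime, the practical workaround is to keep the model inside its stable range: cap the context at ~16k (koboldcpp's `--contextsize 16384`, plus the matching context size setting in SillyTavern) so older messages get trimmed before the model starts losing track of who's in the scene.

If you ever want to try building a long-example set yourself, the first step is just filtering whatever data you have by token length. Here's a rough sketch of that, assuming a JSONL file with one `{"text": ...}` record per line - the file names, field name, threshold, and tokenizer checkpoint are all placeholders to swap for your own setup:

```python
# Rough sketch: keep only samples long enough to teach long-context behavior.
# Everything here (paths, field name, threshold, tokenizer) is illustrative.
import json

from transformers import AutoTokenizer

# Any tokenizer that matches your target model's vocab works here.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Base-2407")

MIN_TOKENS = 16_000  # roughly where Nemo starts to degrade

with open("roleplay_data.jsonl") as src, open("long_samples.jsonl", "w") as dst:
    for line in src:
        sample = json.loads(line)
        if len(tokenizer.encode(sample["text"])) >= MIN_TOKENS:
            dst.write(line)
```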
Oh, I see. I thought Nemo could utilize up to 128K context, but I guess not. Thanks for answering.