Trick for less token usage and less hallucination

#19
by gopi87 - opened

In text-generation-webui, just update llama.cpp manually, then add this in the chat UI:

Start reply with

lets plan the steps and review the steps

for a better response.
*Don't use flash attention.

  • Big thanks to the Qwen team.

Even in this way, you can make the model follow your own custom prompt.
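If you want the same trick outside the web UI, here is a minimal sketch of the idea using llama-cpp-python: prefill the assistant turn with the "Start reply with" text so the model continues from it. The model path, context size, generation settings, and the hand-written ChatML-style template are assumptions for illustration, not something from the original post (recent llama-cpp-python versions expose a `flash_attn` flag, which is left off per the note above).

```python
# Sketch: reproduce the "Start reply with" trick by prefilling the assistant turn.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen-instruct-q4_k_m.gguf",  # hypothetical local GGUF path
    n_ctx=8192,
    flash_attn=False,  # the post recommends not using flash attention
)

user_question = "How do I migrate a SQLite database to PostgreSQL?"
reply_prefix = "lets plan the steps and review the steps"

# Qwen instruct models use a ChatML-style template; appending the prefix
# right after the assistant tag forces the model to continue from it.
prompt = (
    "<|im_start|>user\n"
    f"{user_question}<|im_end|>\n"
    "<|im_start|>assistant\n"
    f"{reply_prefix}"
)

out = llm.create_completion(
    prompt=prompt,
    max_tokens=512,
    stop=["<|im_end|>"],
)
print(reply_prefix + out["choices"][0]["text"])
```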


Start reply with

lets plan it one by one

Gives a very organized response, like o1.
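The sketch above covers this variant too; only the prefix string changes (purely illustrative):

```python
reply_prefix = "lets plan it one by one"  # alternate "Start reply with" text from this reply
```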
