Template tokens

#1 opened by notafraud

The description says "ChatML-ified, with no additional tokens introduced.", yet the model introduces <|im_end|>, which doesn't exist in the original Mistral 24B. Later in the card you also show "Model instruction template: ChatML", so it does use ChatML template tokens.

What do you mean by "no additional tokens introduced" then?
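
For reference, here is a rough side-by-side of how the same exchange renders under the two schemes. The exact strings are my own illustration, not quotes from the model card or tokenizer config:

```python
# Rough illustration only: the same two-turn exchange rendered with ChatML
# turn tokens versus Mistral-style bracket tokens. These strings are
# assumptions for comparison, not taken from the model's files.
chatml_render = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nHi!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

mistral_render = (
    "[SYSTEM_PROMPT]You are a helpful assistant.[/SYSTEM_PROMPT]"
    "[INST]Hi![/INST]"
)

print(chatml_render)
print(mistral_render)
```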

I changed one of the special tokens (20, 21, 22), but forgot to do so in one of the parts I trained (I merged different base tunes that used different datasets). Will look into it.
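
A quick way to verify which token ended up where is to decode those IDs directly; a minimal sketch, assuming the transformers library and a placeholder repo id:

```python
# Minimal sketch for inspecting the special tokens of the merged model.
# "your-org/your-24b-tune" is a placeholder repo id, not the actual model.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("your-org/your-24b-tune")

# What sits at the special-token IDs mentioned above?
print(tok.convert_ids_to_tokens([20, 21, 22]))

# Was <|im_end|> actually added to the vocabulary, and under which id?
print(tok.convert_tokens_to_ids("<|im_end|>"))
print(tok.additional_special_tokens)
```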

SicariusSicariiStuff changed discussion status to closed

Thanks for the answer. Honestly, I would suggest keeping only the original tokens and formatting them in a ChatML-like way:

```
[SYSTEM_PROMPT]
{system_prompt}[/SYSTEM_PROMPT]
[INST]
{prompt}[/INST]
{response}
```

or something similar. I believe this was done before with Mistral 7B finetunes, and now we finally have dedicated tokens for the system prompt.

This would give a more readable structure while keeping the original tokens.
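
For what it's worth, here is a sketch of how that layout could be expressed as a transformers chat template. The Jinja string is my own illustration, not the model's shipped template; role names, newlines, and the placeholder repo id are assumptions:

```python
# Illustrative only: the suggested layout written as a Jinja chat template.
from transformers import AutoTokenizer

template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}"
    "{{ '[SYSTEM_PROMPT]\n' + message['content'] + '[/SYSTEM_PROMPT]\n' }}"
    "{% elif message['role'] == 'user' %}"
    "{{ '[INST]\n' + message['content'] + '[/INST]\n' }}"
    "{% else %}"
    "{{ message['content'] + eos_token }}"
    "{% endif %}"
    "{% endfor %}"
)

tok = AutoTokenizer.from_pretrained("your-org/your-24b-tune")  # placeholder id
tok.chat_template = template

print(tok.apply_chat_template(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi!"},
    ],
    tokenize=False,
))
```

Whether the newlines sit inside or outside the brackets is cosmetic; the point is that only the tokens Mistral already defines get used.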
