Template tokens
The description says "ChatML-ified, with no additional tokens introduced.", yet you introduce <|im_end|>, which doesn't exist in the original Mistral 24b. In the example later, you show "Model instruction template: ChatML", so it does use ChatML template tokens.
What do you mean by "no additional tokens introduced" then?
I changed one of the special tokens (20, 21, 22) but forgot to do so in one of the parts I trained (merged different base tunes that used different datasets). Will look into it.
Thanks for the answer. Honestly, I would suggest keeping only the original tokens and laying them out in a ChatML-like way:
[SYSTEM_PROMPT]
{system_prompt}[/SYSTEM_PROMPT]
[INST]
{prompt}[/INST]
{response}
or something similar. I believe this was done before with Mistral 7b finetunes, and now we finally have defined tokens for the system prompt.
This would give a more readable structure while keeping the original tokens.
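To make the suggested layout concrete, here is a minimal Python sketch that assembles a single turn as a plain string. The function name `format_turn` and the sample strings are hypothetical, and this is only an illustration of the layout proposed above, not the model's actual chat template. Whether the bracketed markers get encoded as the single special tokens depends on how the tokenizer is configured.

```python
def format_turn(system_prompt: str, prompt: str, response: str = "") -> str:
    """Build one turn in the ChatML-like layout suggested above.

    Hypothetical helper: it only concatenates strings and uses the
    original Mistral markers with newlines placed as in the proposal.
    """
    return (
        "[SYSTEM_PROMPT]\n"
        f"{system_prompt}[/SYSTEM_PROMPT]\n"
        "[INST]\n"
        f"{prompt}[/INST]\n"
        f"{response}"
    )


# Example usage with made-up content; leave `response` empty when
# building a prompt for generation.
print(format_turn("You are a helpful assistant.", "Hello!", "Hi there."))
```

Leaving `response` empty gives a prompt that ends right after `[/INST]` and a newline, which is where the model would start generating under this layout.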