Suggested tokenizer changes similar to Phi-4
#8, opened by l2dy
The tokenizer_config.json for Phi-4-mini-instruct also contains the following:
"bos_token": "<|endoftext|>",
"eos_token": "<|endoftext|>",
"pad_token": "<|endoftext|>",
In Phi-4, these were changed so that the three tokens no longer share the same string: https://huggingface.co./microsoft/phi-4/commit/6fbb3d3bbe726c99b4188087b4deeec1bceac5ae
Does it make sense to apply similar changes to Phi-4-mini-instruct?
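
For reference, here is a minimal sketch of how one could check whether several special-token roles in a tokenizer config share the same string. It parses only the three-key fragment quoted above, not the full tokenizer_config.json, and the helper name `shared_special_tokens` is hypothetical. It also assumes that a token may be stored either as a plain string or as a dict with a "content" field.

```python
import json
from collections import defaultdict

# Fragment copied from the post above; a real tokenizer_config.json has more keys.
CONFIG_FRAGMENT = """
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "pad_token": "<|endoftext|>"
}
"""


def shared_special_tokens(config):
    """Return {token_string: [roles]} for strings used by more than one role.

    Hypothetical helper for illustration, not part of any library.
    """
    roles_by_token = defaultdict(list)
    for role in ("bos_token", "eos_token", "pad_token"):
        token = config.get(role)
        # Assumption: a token may be serialized as a dict with a "content" field.
        if isinstance(token, dict):
            token = token.get("content")
        if token is not None:
            roles_by_token[token].append(role)
    # Keep only the strings that serve more than one role.
    return {tok: roles for tok, roles in roles_by_token.items() if len(roles) > 1}


config = json.loads(CONFIG_FRAGMENT)
print(shared_special_tokens(config))
```

An empty result would mean that bos, eos, and pad each use their own string. For the fragment above, all three roles are grouped under the same token.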