bos_token_id mismatch between model config and tokenizer
I'm able to export this model to ExecuTorch following this guide https://github.com/huggingface/optimum-executorch?tab=readme-ov-file#-quick-start, replacing the model_id with "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B". However, when I run the model for inference, I get an error complaining about a bos_token_id mismatch.
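For reference, this is roughly what I ran — the quick-start snippet with the model_id swapped in (exact keyword arguments may differ slightly across optimum-executorch versions):

```python
from transformers import AutoTokenizer
from optimum.executorch import ExecuTorchModelForCausalLM

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# Export the model to ExecuTorch with the XNNPACK recipe, per the quick-start guide
model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Inference fails here with the bos_token_id mismatch below
generated_text = model.text_generation(
    tokenizer=tokenizer,
    prompt="Simply put, the theory of relativity states that",
    max_seq_len=128,
)
```

The failure: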
File "~/optimum-executorch/optimum/executorch/modeling.py", line 351, in text_generation
raise ValueError(
ValueError: The tokenizer's bos_token_id=151646 must be the same as the model's bos_token_id=151643.
Upon checking the config files, I notice:
In config.json, the bos_token_id is 151643: https://huggingface.co./deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/blob/main/config.json#L6-L7
The tokenizer, however, resolves bos_token_id to 151646, and generation_config.json agrees: https://huggingface.co./deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/blob/main/generation_config.json#L3
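The mismatch is reproducible with plain transformers, without any export step — a minimal check, assuming the hub files linked above:

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The model config and the tokenizer disagree about the BOS id
print(config.bos_token_id)     # 151643, from config.json
print(tokenizer.bos_token_id)  # 151646, from the tokenizer files
```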
It seems like a bug?
BTW, if I switch to "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", everything works end-to-end without any problem. So I think the issue is legitimate, and it only shows up in the distills that use Qwen as the base model.
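As a stopgap, forcing the tokenizer's BOS token to match the config's id should get past the check — a hypothetical, untested workaround, assuming overriding bos_token is safe for this model:

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Point the tokenizer's bos_token at whatever token the config's
# bos_token_id (151643) decodes to, so both sides agree.
# Untested workaround; it may change generation behavior for this model.
tokenizer.bos_token = tokenizer.convert_ids_to_tokens(config.bos_token_id)
assert tokenizer.bos_token_id == config.bos_token_id
```

The proper fix is presumably to make the hub config files consistent, but the snippet above may unblock local experimentation in the meantime.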