BUG: Chat template doesn't respect `add_generation_prompt` flag from transformers tokenizer

#44 opened by ilu000

The template expression `{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}` always gets appended to the end of the rendered chat template and doesn't respect `add_generation_prompt=False` (https://huggingface.co./meta-llama/Meta-Llama-3.1-8B-Instruct/blob/main/tokenizer_config.json#L2053).

Reproduction with `transformers==4.43.2`:

```python
from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"

conversation = [
    {'content': 'Hey, what up?', 'role': 'user'},
    {'content': 'Not much. I am here to help you.', 'role': 'assistant'},
]

tokenizer = AutoTokenizer.from_pretrained(model_id)
out1 = tokenizer.apply_chat_template(conversation, add_generation_prompt=False, tokenize=False)
out2 = tokenizer.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)

print(out1 == out2)  # True -- the two renderings are identical
print(repr(out1))
# '<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHey, what up?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nNot much. I am here to help you.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'
```
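Until the template in tokenizer_config.json is updated, one possible workaround is to wrap the trailing assistant-header expression in an `add_generation_prompt` guard before rendering. This is a minimal sketch, not an official fix: the exact spelling of the unconditional line (quoting, whitespace-control markers like `{{-`) is an assumption, so inspect `tokenizer.chat_template` and adjust the replaced string if it doesn't match.

```python
from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumed spelling of the unconditional line in the shipped template;
# check tokenizer.chat_template and adjust if it differs (e.g. '{{-' markers).
unguarded = "{{ '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}"
guarded = "{% if add_generation_prompt %}" + unguarded + "{% endif %}"

if unguarded in tokenizer.chat_template:
    # Patch the in-memory template so the header is only added on request.
    tokenizer.chat_template = tokenizer.chat_template.replace(unguarded, guarded, 1)
else:
    print("Template doesn't contain the expected literal; patch it manually.")
```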

Came here to post this! This is a big issue, because any finetuning that uses apply_chat_template will teach the model an incorrect output format: every rendered example ends with a second, empty assistant header after the final assistant turn.
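Until the template is fixed, a data-level workaround for finetuning is to strip the dangling generation prompt from the rendered text, so each training example ends with the final `<|eot_id|>` instead of an empty assistant header. A minimal sketch (the helper name is mine, not part of transformers):

```python
# Workaround sketch for preparing finetuning text with the buggy template:
# drop the trailing assistant header that gets appended even when
# add_generation_prompt=False.
GEN_PROMPT = "<|start_header_id|>assistant<|end_header_id|>\n\n"

def render_without_gen_prompt(tokenizer, conversation):
    text = tokenizer.apply_chat_template(
        conversation, add_generation_prompt=False, tokenize=False
    )
    if text.endswith(GEN_PROMPT):
        text = text[: -len(GEN_PROMPT)]  # keep the final <|eot_id|> as the last token
    return text
```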
