apply_chat_template(add_generation_prompt=True) not working

#9
by odellus - opened

I'm not able to get tokenizer.apply_chat_template to append the generation prompt for stablelm-zephyr-3b.
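For reference, the tokenizer below is loaded the standard way (a sketch assuming the stabilityai/stablelm-zephyr-3b repo this discussion is attached to; trust_remote_code may or may not be required depending on your transformers version):

from transformers import AutoTokenizer

# Load the tokenizer that ships with the model repo.
tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-zephyr-3b', trust_remote_code=True)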

print(tokenizer.chat_template)
"{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n'  + message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}"

chat = [{'role': 'system', 'content': 'You are an excellent C++ programmer'}, {'role': 'user', 'content': 'Write a program to compute pairwise distances between atoms in a PDB file'}]

tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
'<|system|>\nYou are an excellent C++ programmer<|endoftext|>\n<|user|>\nWrite a program to compute pairwise distances between atoms in a PDB file<|endoftext|>\n'

tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=False)
'<|system|>\nYou are an excellent C++ programmer<|endoftext|>\n<|user|>\nWrite a program to compute pairwise distances between atoms in a PDB file<|endoftext|>\n'

Could this be an issue with the tokenizer module? The chat template itself looks right.
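One way to confirm that is to render the template through Jinja2 directly, bypassing apply_chat_template (a minimal sketch; transformers compiles chat templates with trim_blocks and lstrip_blocks enabled, which is mirrored here):

from jinja2 import Environment

# Render the chat template by hand; trim_blocks/lstrip_blocks strip the
# cosmetic newlines around the {% ... %} tags, matching how transformers
# compiles the template internally.
env = Environment(trim_blocks=True, lstrip_blocks=True)
template = env.from_string(tokenizer.chat_template)
print(template.render(messages=chat, eos_token=tokenizer.eos_token, add_generation_prompt=True))

If this prints the trailing '<|assistant|>' while apply_chat_template does not, the template itself is fine and the problem is in the installed transformers version.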

Upgrading transformers fixed this issue for me. Closing.
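For anyone landing here later, a quick post-upgrade check (pip install -U transformers, then re-run with the same tokenizer and chat as above):

import transformers

print(transformers.__version__)

out = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
print(out)
# On a fixed version the output should end with the generation prompt:
# '...<|endoftext|>\n<|assistant|>\n'
assert out.rstrip('\n').endswith('<|assistant|>')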

odellus changed discussion status to closed
