BUG Chat template doesn't respect `add_generation_prompt` flag from transformers tokenizer
#44 · opened by ilu000
`{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}`

This always gets appended to the end of the rendered chat template and doesn't respect `add_generation_prompt=False`; see the template in the tokenizer config:
https://huggingface.co./meta-llama/Meta-Llama-3.1-8B-Instruct/blob/main/tokenizer_config.json#L2053
With `transformers==4.43.2`:

```python
from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
conversation = [
    {'content': 'Hey, what up?', 'role': 'user'},
    {'content': 'Not much. I am here to help you.', 'role': 'assistant'},
]

tokenizer = AutoTokenizer.from_pretrained(model_id)
out1 = tokenizer.apply_chat_template(conversation, add_generation_prompt=False, tokenize=False)
out2 = tokenizer.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)

print(out1 == out2)  # True: the two renderings are identical
print(repr(out1))
```

Both calls produce:

```
'<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHey, what up?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nNot much. I am here to help you.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'
```
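For comparison, a template that honors the flag wraps that trailing block in a conditional. A minimal Jinja sketch of such a guard (assuming the rest of the Llama 3.1 template stays unchanged; this is an illustration, not necessarily the exact upstream fix):

```jinja
{%- if add_generation_prompt %}
    {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{%- endif %}
```

With a guard like this, `add_generation_prompt=False` ends the rendering at the final `<|eot_id|>` instead of opening a new assistant turn.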
Came here to post this! This is a big issue: any fine-tuning that uses `apply_chat_template` will teach the model an incorrect output format, because the rendered text contains two assistant header tags, one for the real assistant turn and a spurious trailing one.
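Until the template in the repo is fixed, one workaround when preparing fine-tuning data is to strip the trailing header from the rendered string yourself. A minimal sketch; the helper name `strip_trailing_generation_prompt` is mine, not part of transformers, and it reuses `tokenizer` and `conversation` from the repro above:

```python
GENERATION_PROMPT = '<|start_header_id|>assistant<|end_header_id|>\n\n'

def strip_trailing_generation_prompt(text: str) -> str:
    # Drop the unconditionally appended assistant header so the string
    # behaves as if add_generation_prompt=False had been respected.
    if text.endswith(GENERATION_PROMPT):
        return text[: -len(GENERATION_PROMPT)]
    return text

train_text = strip_trailing_generation_prompt(
    tokenizer.apply_chat_template(conversation, add_generation_prompt=False, tokenize=False)
)
assert not train_text.endswith(GENERATION_PROMPT)
```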