When running example got ValueError: Attention mask should be of size (1, 1, 1, 30), but is torch.Size([1, 1, 1, 29])

#9
by Phando - opened

When running the given example with transformers==4.48.2:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "moonshotai/Moonlight-16B-A3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant provided by Moonshot-AI."},
    {"role": "user", "content": "Is 123 a prime?"}
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
generated_ids = model.generate(inputs=input_ids, max_new_tokens=500)
response = tokenizer.batch_decode(generated_ids)[0]
print(response)

I get the error:

ValueError: Attention mask should be of size (1, 1, 1, 30), but is torch.Size([1, 1, 1, 29])

Something seems to be wrong in prepare_inputs_for_attention, I guess?
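For context, the error is an off-by-one between the mask length and the cached key length: the mask covers the 29 prompt tokens, while the model expects it to cover 30 positions. Below is a minimal, hedged sketch of the kind of shape check that raises this error (an illustration only, not Moonlight's actual modeling code; `check_mask` is a hypothetical helper):

```python
# Sketch of the shape check behind this ValueError (not the model's real code).
# During cached decoding, the 4-D attention mask's last dimension must equal
# the key/value sequence length: prompt tokens plus already-generated tokens.
def check_mask(mask_shape: tuple, kv_seq_len: int) -> None:
    expected = (1, 1, 1, kv_seq_len)
    if tuple(mask_shape) != expected:
        raise ValueError(
            f"Attention mask should be of size {expected}, "
            f"but is {tuple(mask_shape)}"
        )

# 29 prompt tokens, one new token appended to the KV cache -> keys are 30
# long, but the mask still covers only the 29 prompt positions (off by one).
try:
    check_mask((1, 1, 1, 29), kv_seq_len=30)
except ValueError as e:
    print(e)
```

One thing that may be worth trying (an assumption, not a confirmed fix): build the inputs with `tokenizer.apply_chat_template(..., return_dict=True, return_tensors="pt")` and call `model.generate(**inputs, ...)`, so an explicit `attention_mask` is passed instead of letting `generate` infer one.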
