how to extract model response from the output of tokenizer
#54 by mans-0987 - opened
I can use this model successfully. I am using it in chat mode to create the prompt and get the model result as follows:
chat = [
{ "role": "user", "content": "What is the name of the first moon lander?" },
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(prompt, add_special_tokens=True, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=500)
print(tokenizer.decode(outputs[0]))
When the output is printed, it contains the prompt in chat format as well as the model-generated text. How can I extract just the model-generated text from this output?
Are there any tools in the tokenizer to convert the model.generate() output into a dictionary, JSON, or a similar data structure that can be easily worked with?
I ran into this problem too. I followed https://github.com/ygivenx/google-gemma/blob/main/get_started.py and it works.
You can simply use
tokenizer.decode(outputs[0, inputs.shape[-1]:], skip_special_tokens=True)
to extract the response. Slicing from inputs.shape[-1] skips the prompt tokens, so only the newly generated tokens are decoded.
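Putting it together, here is a minimal sketch of the full flow, assuming the transformers tokenizer and model are loaded as in the original snippet; the checkpoint name and the result dictionary are illustrative, not part of the original post:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"  # illustrative checkpoint, substitute the one you use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

chat = [{"role": "user", "content": "What is the name of the first moon lander?"}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(prompt, add_special_tokens=True, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=500)

# Decode only the tokens generated after the prompt, dropping special tokens.
response = tokenizer.decode(outputs[0, inputs.shape[-1]:], skip_special_tokens=True)

# Optionally wrap the exchange in a plain dictionary for downstream use.
result = {"question": chat[0]["content"], "answer": response}
print(result)

The slicing works because model.generate() returns the prompt tokens followed by the new tokens, so everything from index inputs.shape[-1] onward is the model's reply.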
Hope this helps. Closing this issue as it seems resolved, thanks @hiyouga!
suryabhupa changed discussion status to closed