Prompt format

#4
by sam-paech - opened

Was this trained with a specific prompt format?

<|system|>
You are a helpful assistant.</s>
<|user|>
Hello, how are you?</s>
<|assistant|>
I'm doing great. How can I help you today?</s>
<|user|>
Show me how to build a website in 10 simple steps</s>
<|assistant|>

Thanks!

Hugging Face H4 org

The Jinja chat template is also part of the tokenizer if you need it: https://huggingface.co./HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/blob/4e5568b3b7428916cc30b38c94b282707ee5a48e/tokenizer_config.json#L32
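For reference, something like the sketch below should render the same prompt as the example above using that built-in template (a minimal sketch; the messages are just the ones from the original post, and the repo name is taken from the link):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "Show me how to build a website in 10 simple steps"},
]

# add_generation_prompt=True appends the trailing <|assistant|> turn so the
# model knows to continue as the assistant.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```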

Does the text-generation pipeline automatically apply the tokenizer's chat template when used as per the example code?

I thought it needed to be applied with tokenizer.apply_chat_template, but maybe I missed the memo.

Hugging Face H4 org

To be specific, a chat template is applied if the input looks like a chat in the style of the OpenAI API (i.e. a list of dicts with role and content keys). If you pass a single string, the pipeline won't try to apply a chat template to it.
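So a sketch along these lines should work without calling tokenizer.apply_chat_template yourself (assuming a recent transformers, accelerate installed for device_map, and hardware that can actually hold the model):

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Show me how to build a website in 10 simple steps"},
]

# Because the input is a list of {"role", "content"} dicts, the pipeline
# formats it with the tokenizer's chat template before generating.
outputs = pipe(messages, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
```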
