fireballoon
committed on
Commit
•
9cbd412
1
Parent(s):
b5998e0
Update README.md
Browse files
README.md
CHANGED
@@ -36,6 +36,7 @@ Inference with Transformers:
|
|
36 |
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
|
37 |
>>> tokenizer = AutoTokenizer.from_pretrained("fireballoon/baichuan-vicuna-7b", use_fast=False)
|
38 |
>>> model = AutoModelForCausalLM.from_pretrained("fireballoon/baichuan-vicuna-7b").half().cuda()
|
|
|
39 |
>>> instruction = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {} ASSISTANT:"
|
40 |
>>> prompt = instruction.format("five tips to help with sleep") # user message
|
41 |
>>> generate_ids = model.generate(tokenizer(prompt, return_tensors='pt').input_ids.cuda(), max_new_tokens=2048, streamer=streamer)
|
|
|
36 |
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
|
37 |
>>> tokenizer = AutoTokenizer.from_pretrained("fireballoon/baichuan-vicuna-7b", use_fast=False)
|
38 |
>>> model = AutoModelForCausalLM.from_pretrained("fireballoon/baichuan-vicuna-7b").half().cuda()
|
39 |
+
>>> streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
|
40 |
>>> instruction = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {} ASSISTANT:"
|
41 |
>>> prompt = instruction.format("five tips to help with sleep") # user message
|
42 |
>>> generate_ids = model.generate(tokenizer(prompt, return_tensors='pt').input_ids.cuda(), max_new_tokens=2048, streamer=streamer)
|