Text Generation
Transformers
PyTorch
llama
text-generation-inference
Commit 9a7a174 (parent 0854599) by PengQu: Update README.md

 
**NOTE: This "delta model" cannot be used directly.**
Users have to apply it on top of the original LLaMA weights to get the actual vicuna-13b-finetuned-langchain-MRKL weights.
See https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL#model-weights for instructions.

# vicuna-13b-finetuned-langchain-MRKL

**Model type:**
vicuna-13b-finetuned-langchain-MRKL is an open-source chatbot trained by fine-tuning vicuna-13b on 15 examples in the langchain-MRKL format.

**Model Usage:**

To obtain the actual model, first run apply_delta.py (https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/model/apply_delta.py); see the instructions at https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL#model-weights.
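Conceptually, applying a Vicuna-style delta is an element-wise addition of the delta checkpoint onto the base LLaMA weights (target = base + delta). A minimal sketch of that operation — the function name is mine, not from apply_delta.py, and the real script additionally handles tokenizer files and sharded checkpoints:

```python
def apply_delta(base_state, delta_state):
    """Merge a delta checkpoint onto base weights: target = base + delta.

    Works on any mapping of parameter name -> tensor, since torch
    tensors support element-wise `+` (plain numbers work too).
    """
    return {name: base_state[name] + delta for name, delta in delta_state.items()}
```

In practice, use the repository's apply_delta.py rather than this sketch, since it also verifies and saves the merged checkpoint.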
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged weights produced by apply_delta.py (not the raw delta).
tokenizer = AutoTokenizer.from_pretrained("path/to/vicuna-13b-finetuned-langchain-MRKL")
model = AutoModelForCausalLM.from_pretrained("path/to/vicuna-13b-finetuned-langchain-MRKL")
model.cuda()

prompt = """Answer the following questions as best you can. You have access to the following tools:

Search: useful for when you need to answer questions about current events
Calculator: useful for when you need to answer questions about math

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: The current age of the President of the United States multiplied by 0.5.
Thought:"""

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
tokens = model.generate(input_ids, min_length=5, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
Output (the tokens generated after "Thought:"):

```sh
I need to find the current age of the President and then multiply it by 0.5
Action: Search
Action Input: Who is the President of the United States?
```
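A MRKL-style driver then has to pull the tool call out of text like this before it can invoke Search or Calculator. A minimal parser sketch — the function name and regex are my own, not part of this repo:

```python
import re

def parse_action(generation: str):
    """Extract (tool, tool_input) from a MRKL-format completion.

    Looks for an 'Action:' line followed by an 'Action Input:' line;
    returns None when the model emitted a Final Answer instead.
    """
    match = re.search(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", generation)
    if not match:
        return None
    return match.group(1).strip(), match.group(2).strip().strip('"')
```

The driver would dispatch on the returned tool name, run the tool, and append the result as an `Observation:` line before generating again.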
If you have launched an HTTP server serving the model and installed langchain (https://github.com/hwchase17/langchain), you can edit demo.py (https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/demo.py) to point at your server's IP and port, then run it.
You can also try this in a Jupyter notebook: https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/blob/main/demo.ipynb

**Where to send questions or comments about the model:**

https://github.com/rinnakk/vicuna-13b-delta-finetuned-langchain-MRKL/issues

## Training dataset

Trained for only one epoch on mixed data (sharegpt + 32*my.json + moss-003-sft-data).
 
- very fast because of the strict format (it doesn't generate redundant tokens)
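That strictness can also be enforced at inference time: generation should stop as soon as the model starts an "Observation:" line, because the observation comes from the tool, not the model. A minimal post-hoc truncation helper — the name is mine, not from this repo:

```python
def truncate_at_observation(generation: str, stop: str = "\nObservation:") -> str:
    """Cut the completion at the first tool-call boundary.

    Anything after "Observation:" would be hallucinated tool output, so
    the driver discards it and substitutes the real tool result.
    """
    idx = generation.find(stop)
    return generation if idx == -1 else generation[:idx]
```

The same effect can be achieved during decoding with a stop-sequence or stopping-criteria mechanism, which saves the redundant tokens entirely.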
## Author

Qu Peng (https://huggingface.co/PengQu)