---
license: llama2
---
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->

<div align="center">
<h1>
SlimPLM
</h1>
</div>

<p align="center">
📝 <a href="https://arxiv.org/abs/2402.12052" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/" target="_blank">Hugging Face</a> • 🧩 <a href="https://github.com/plageon/SlimPLM" target="_blank">GitHub</a>
</p>
## ✨ Latest News

- [1/25/2024]: Search Necessity Judgment Model released on [Hugging Face](https://huggingface.co/zstanjj/SlimPLM-Search-Necessity-Judgment/).
- [2/20/2024]: Query Rewriting Model released on [Hugging Face](https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/).
## 🎬 Get Started

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# construct the Llama-2 chat prompt from the question and a heuristic (coarse) answer
question = "Who voices Darth Vader in Star Wars Episodes III-VI, IX Rogue One, and Rebels?"
heuristic_answer = "The voice of Darth Vader in Star Wars is provided by British actor James Earl Jones. He first voiced the character in the 1977 film \"Star Wars: Episode IV - A New Hope\", and his performance has been used in all subsequent Star Wars films, including the prequels and sequels."
prompt = (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
          f" structured formats according to the coarse answer. Current datatime is 2023-12-20 9:47:28"
          f" <</SYS>>\n Course answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]")
params_query_rewrite = {"repetition_penalty": 1.05, "temperature": 0.01, "top_k": 1, "top_p": 0.85,
                        "max_new_tokens": 512, "do_sample": False}
torch.manual_seed(2023)

# load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("zstanjj/SlimPLM-Search-Necessity-Judgment").eval()
if torch.cuda.is_available():
    model.cuda()
tokenizer = AutoTokenizer.from_pretrained("zstanjj/SlimPLM-Search-Necessity-Judgment")

# run inference on the full prompt and decode only the newly generated tokens
input_ids = tokenizer.encode(prompt, return_tensors="pt")
len_input_ids = len(input_ids[0])
if torch.cuda.is_available():
    input_ids = input_ids.cuda()
outputs = model.generate(input_ids, **params_query_rewrite)
res = tokenizer.decode(outputs[0][len_input_ids:], skip_special_tokens=True)
print(res)
```
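If you query the model for several questions, the prompt construction above can be factored into a small helper. This is a minimal sketch; `build_rewrite_prompt` is a hypothetical name, not part of the released code, and the template text (including the literal "Course answer" and "datatime" strings) is kept verbatim on the assumption that it matches what the model was tuned on:

```python
def build_rewrite_prompt(question: str, heuristic_answer: str,
                         datetime_str: str = "2023-12-20 9:47:28") -> str:
    """Wrap a question and its coarse (heuristic) answer in the Llama-2
    [INST] chat template used by this model card. The wording is reproduced
    verbatim, since the model presumably expects this exact format."""
    return (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
            f" structured formats according to the coarse answer. Current datatime is {datetime_str}"
            f" <</SYS>>\n Course answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]")
```

The returned string can be passed directly to `tokenizer.encode` in place of the inline `prompt` above.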
## ✏️ Citation

```
@inproceedings{Tan2024SmallMB,
  title={Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs},
  author={Jiejun Tan and Zhicheng Dou and Yutao Zhu and Peidong Guo and Kun Fang and Jinhui Wen},
  year={2024},
  url={https://api.semanticscholar.org/CorpusID:267750726}
}
```