Limerobot committed on
Commit 068743a · 1 Parent(s): c760a20

Update README.md

  1. README.md +62 -0
README.md CHANGED
@@ -1,3 +1,65 @@
  ---
  license: apache-2.0
  ---
+
+ # **Meet 10.7B Solar: Elevating Performance with Upstage Depth Up-Scaling!**
+
+
+ # **Introduction**
+
+ We introduce [SOLAR-10.7B](https://huggingface.co/upstage/SOLAR-10.7B-v1.0), the first 10.7 billion (B) parameter model. It is compact yet remarkably powerful, and it delivers state-of-the-art performance among models with fewer than 30B parameters.
+
+ SOLAR-10.7B is built with Depth Up-Scaling, a technique we developed at Upstage. Starting from the Llama2 architecture, we applied Depth Up-Scaling, integrated Mistral 7B weights into the upscaled layers, and then continued pre-training the entire model.
+
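As a rough illustration of what depth up-scaling does to a layer stack, the sketch below operates on layer indices only, not on real weights. The specific numbers (a 32-layer base, 8 layers trimmed from each copy, 48 layers in the result) are taken from the published SOLAR 10.7B paper rather than from this README, and the function name `depth_up_scale` is hypothetical.

```python
def depth_up_scale(layers, trim):
    """Duplicate a layer stack, trim both copies at the seam, and concatenate them."""
    if not 0 <= trim < len(layers):
        raise ValueError("trim must be in [0, len(layers))")
    top = layers[: len(layers) - trim]  # first copy without its last `trim` layers
    bottom = layers[trim:]              # second copy without its first `trim` layers
    return top + bottom


base = list(range(32))                  # indices of the base model's layers
scaled = depth_up_scale(base, trim=8)

print(len(scaled))     # 48
print(scaled[22:26])   # [22, 23, 8, 9] -> the seam between the two copies
```

The seam is where the two copies meet, which is why the upscaled model is pre-trained further instead of being used directly.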
+ Depth-upscaled SOLAR-10.7B performs remarkably well. It outperforms models with up to 30B parameters, even surpassing the recent Mixtral 8x7B model. For detailed information, please refer to the experimental table ([link to be updated soon]).
+
+ SOLAR-10.7B is also an ideal choice for fine-tuning, offering robustness and adaptability. Simple instruction fine-tuning on top of the SOLAR-10.7B pre-trained model yields significant performance improvements. [[link to be updated soon]]
+
+
+ # **Usage Instructions**
+
+ This model has been fine-tuned primarily for single-turn interactions, making it less suitable for multi-turn chat purposes.
+
+ ### **Version**
+
+ Make sure you have the correct version of the transformers library installed:
+
+ ```sh
+ pip install transformers==4.35.2
+ ```
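
To confirm that the pinned version is the one actually installed, a small check along these lines may help. It is only a sketch: the helper names `installed_version` and `version_status` are hypothetical, and it relies solely on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

REQUIRED_VERSION = "4.35.2"  # the version pinned in the install command


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_status(installed, required=REQUIRED_VERSION):
    """Classify an installed version as 'missing', 'ok', or 'mismatch'."""
    if installed is None:
        return "missing"
    return "ok" if installed == required else "mismatch"


print(version_status(installed_version("transformers")))
```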
+
+ ### **Loading the Model**
+
+ Use the following Python code to load the model:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("Upstage/SOLAR-10.7B-Instruct-v1.0")
+ model = AutoModelForCausalLM.from_pretrained(
+     "Upstage/SOLAR-10.7B-Instruct-v1.0",
+     device_map="auto",          # place weights on available devices (requires accelerate)
+     torch_dtype=torch.float16,  # load the weights in half precision
+ )
+ ```
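
For a sense of how much memory the weights need at a given precision, a back-of-the-envelope estimate can be sketched as follows. It treats 10.7B as the exact parameter count (an approximation), counts weights only, and ignores activations and the generation cache; `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


params = 10.7e9                        # assumed parameter count
fp16 = weight_memory_gib(params, 2)    # float16 uses 2 bytes per parameter
fp32 = weight_memory_gib(params, 4)    # float32 uses 4 bytes per parameter

print(f"float16: {fp16:.1f} GiB, float32: {fp32:.1f} GiB")
# float16: 19.9 GiB, float32: 39.9 GiB
```

Under these assumptions, half precision roughly halves the memory needed for the weights.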
+
+ ### **Conducting Single-Turn Conversation**
+
+ ```python
+ conversation = [ {'role': 'user', 'content': 'Hello?'} ]
+
+ # Render the conversation into the prompt format the model expects
+ prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, use_cache=True, max_length=4096)
+ output_text = tokenizer.decode(outputs[0])
+ print(output_text)
+ ```
+
+ Below is an example of the output.
+ ```
+ <s> <|im_start|>user
+ Hello?<|im_end|>
+ <|im_start|>assistant
+ Hello, how can I assist you today?</s>
+
+ ```
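
Because the decoded text contains the prompt and special tokens as well as the answer, it can be convenient to pull out just the assistant's reply. The sketch below does this with plain string handling, assuming the output follows the format in the example output; `extract_reply` is a hypothetical helper, not part of the transformers library.

```python
def extract_reply(decoded):
    """Return the text of the last assistant turn in a decoded generation."""
    marker = "<|im_start|>assistant\n"
    start = decoded.rfind(marker)
    if start == -1:
        raise ValueError("no assistant turn found")
    reply = decoded[start + len(marker):]
    # Cut the reply at the first end-of-turn or end-of-sequence token, if present
    for end_token in ("<|im_end|>", "</s>"):
        cut = reply.find(end_token)
        if cut != -1:
            reply = reply[:cut]
    return reply.strip()


example = (
    "<s> <|im_start|>user\n"
    "Hello?<|im_end|>\n"
    "<|im_start|>assistant\n"
    "Hello, how can I assist you today?</s>\n"
)
print(extract_reply(example))  # Hello, how can I assist you today?
```

An alternative that avoids string parsing is to decode only the newly generated tokens, for example `outputs[0][inputs["input_ids"].shape[1]:]`.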