mlabonne committed
Commit a3fc909 · verified · 1 Parent(s): de84d9e

Update README.md

Files changed (1)
1. README.md +204 -14
README.md CHANGED
@@ -1,36 +1,226 @@
  ---
- license: mit
- base_model:
- - PRIME-RL/Eurus-2-7B-PRIME
- - Qwen/Qwen2.5-7B-Instruct
  tags:
  - merge
  - mergekit
  - lazymergekit
  ---

  # Daredevil-8B

- Daredevil-8B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
- * [PRIME-RL/Eurus-2-7B-PRIME](https://huggingface.co/PRIME-RL/Eurus-2-7B-PRIME)
- * [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)

  ## 🧩 Configuration

  ```yaml
  models:
-   - model: Qwen/Qwen-2.5-Math-7B
      # No parameters necessary for base model
-   - model: PRIME-RL/Eurus-2-7B-PRIME
      parameters:
        density: 0.56
-       weight: 0.5
-   - model: Qwen/Qwen2.5-7B-Instruct
      parameters:
        density: 0.56
-       weight: 0.5
  merge_method: dare_ties
- base_model: Qwen/Qwen-2.5-Math-7B
  dtype: bfloat16
  ```

@@ -51,7 +241,7 @@ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_
  pipeline = transformers.pipeline(
      "text-generation",
      model=model,
-     torch_dtype=torch.float16,
      device_map="auto",
  )

  ---
+ license: other
  tags:
  - merge
  - mergekit
  - lazymergekit
+ base_model:
+ - nbeerbower/llama-3-stella-8B
+ - Hastagaras/llama-3-8b-okay
+ - nbeerbower/llama-3-gutenberg-8B
+ - openchat/openchat-3.6-8b-20240522
+ - Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
+ - cstr/llama3-8b-spaetzle-v20
+ - mlabonne/ChimeraLlama-3-8B-v3
+ - flammenai/Mahou-1.1-llama3-8B
+ - KingNish/KingNish-Llama3-8b
+ model-index:
+ - name: Daredevil-8B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 68.86
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 84.5
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 69.24
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 59.89
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 78.45
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 73.54
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mlabonne/Daredevil-8B
+       name: Open LLM Leaderboard
  ---

  # Daredevil-8B

+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/gFEhcIDSKa3AWpkNfH91q.jpeg)
+
+ Daredevil-8B is a mega-merge designed to maximize MMLU. As of 27 May 24, it is the Llama 3 8B model with the **highest MMLU score** on the Open LLM Leaderboard.
+ In my experience, a high MMLU score is all you need with Llama 3 models.
+
+ It is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ * [nbeerbower/llama-3-stella-8B](https://huggingface.co/nbeerbower/llama-3-stella-8B)
+ * [Hastagaras/llama-3-8b-okay](https://huggingface.co/Hastagaras/llama-3-8b-okay)
+ * [nbeerbower/llama-3-gutenberg-8B](https://huggingface.co/nbeerbower/llama-3-gutenberg-8B)
+ * [openchat/openchat-3.6-8b-20240522](https://huggingface.co/openchat/openchat-3.6-8b-20240522)
+ * [Kukedlc/NeuralLLaMa-3-8b-DT-v0.1](https://huggingface.co/Kukedlc/NeuralLLaMa-3-8b-DT-v0.1)
+ * [cstr/llama3-8b-spaetzle-v20](https://huggingface.co/cstr/llama3-8b-spaetzle-v20)
+ * [mlabonne/ChimeraLlama-3-8B-v3](https://huggingface.co/mlabonne/ChimeraLlama-3-8B-v3)
+ * [flammenai/Mahou-1.1-llama3-8B](https://huggingface.co/flammenai/Mahou-1.1-llama3-8B)
+ * [KingNish/KingNish-Llama3-8b](https://huggingface.co/KingNish/KingNish-Llama3-8b)
+
+ Thanks to nbeerbower, Hastagaras, openchat, Kukedlc, cstr, flammenai, and KingNish for their merges. Special thanks to Charles Goddard and Arcee.ai for MergeKit.
+
+ ## 🔎 Applications
+
+ You can use it as an improved version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
+
+ This is a censored model. For an uncensored version, see [mlabonne/Daredevil-8B-abliterated](https://huggingface.co/mlabonne/Daredevil-8B-abliterated).
+
+ Tested on LM Studio using the "Llama 3" preset.
+
+ ## ⚡ Quantization
+
+ * **GGUF**: https://huggingface.co/mlabonne/Daredevil-8B-GGUF
+
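A minimal sketch of running one of those GGUF files locally with llama-cpp-python; the quant filename and context size below are placeholders rather than values taken from the card, so pick whichever quant level you actually downloaded from the repo:

```python
# Sketch: load a locally downloaded Daredevil-8B GGUF with llama-cpp-python
# (pip install llama-cpp-python). The model path and n_ctx are illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="./daredevil-8b.Q4_K_M.gguf",  # placeholder quant file
    n_ctx=4096,                                # context window to allocate
)

# llama-cpp-python applies the chat template stored in the GGUF metadata.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a large language model?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```
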
+ ## 🏆 Evaluation
+
+ ### Open LLM Leaderboard
+
+ Daredevil-8B is the best-performing 8B model on the Open LLM Leaderboard in terms of MMLU score (27 May 24).
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/xFKhGdSaIxL9_tcJPhM5w.png)
+
+ ### Nous
+
+ Daredevil-8B is the best-performing 8B model on Nous' benchmark suite (evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval), 27 May 24). See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
+
+ | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
+ |---|---:|---:|---:|---:|---:|
+ | [**mlabonne/Daredevil-8B**](https://huggingface.co/mlabonne/Daredevil-8B) [📄](https://gist.github.com/mlabonne/080f9c5f153ea57a7ab7d932cf896f21) | **55.87** | **44.13** | **73.52** | **59.05** | **46.77** |
+ | [mlabonne/Daredevil-8B-abliterated](https://huggingface.co/mlabonne/Daredevil-8B-abliterated) [📄](https://gist.github.com/mlabonne/32cdd8460804662c856bcb2a20acd49e) | 55.06 | 43.29 | 73.33 | 57.47 | 46.17 |
+ | [mlabonne/Llama-3-8B-Instruct-abliterated-dpomix](https://huggingface.co/mlabonne/Llama-3-8B-Instruct-abliterated-dpomix) [📄](https://gist.github.com/mlabonne/d711548df70e2c04771cc68ab33fe2b9) | 52.26 | 41.6 | 69.95 | 54.22 | 43.26 |
+ | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) [📄](https://gist.github.com/mlabonne/8329284d86035e6019edb11eb0933628) | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
+ | [failspy/Meta-Llama-3-8B-Instruct-abliterated-v3](https://huggingface.co/failspy/Meta-Llama-3-8B-Instruct-abliterated-v3) [📄](https://gist.github.com/mlabonne/f46cce0262443365e4cce2b6fa7507fc) | 51.21 | 40.23 | 69.5 | 52.44 | 42.69 |
+ | [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B) [📄](https://gist.github.com/mlabonne/22896a1ae164859931cc8f4858c97f6f) | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
+ | [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [📄](https://gist.github.com/mlabonne/616b6245137a9cfc4ea80e4c6e55d847) | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |
+
+ ## 🌳 Model family tree
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/ekwRGgnjzEOyprT8sEBFt.png)

  ## 🧩 Configuration

  ```yaml
  models:
+   - model: NousResearch/Meta-Llama-3-8B
      # No parameters necessary for base model
+   - model: nbeerbower/llama-3-stella-8B
+     parameters:
+       density: 0.6
+       weight: 0.16
+   - model: Hastagaras/llama-3-8b-okay
+     parameters:
+       density: 0.56
+       weight: 0.1
+   - model: nbeerbower/llama-3-gutenberg-8B
+     parameters:
+       density: 0.6
+       weight: 0.18
+   - model: openchat/openchat-3.6-8b-20240522
      parameters:
        density: 0.56
+       weight: 0.12
+   - model: Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
+     parameters:
+       density: 0.58
+       weight: 0.18
+   - model: cstr/llama3-8b-spaetzle-v20
      parameters:
        density: 0.56
+       weight: 0.08
+   - model: mlabonne/ChimeraLlama-3-8B-v3
+     parameters:
+       density: 0.56
+       weight: 0.08
+   - model: flammenai/Mahou-1.1-llama3-8B
+     parameters:
+       density: 0.55
+       weight: 0.05
+   - model: KingNish/KingNish-Llama3-8b
+     parameters:
+       density: 0.55
+       weight: 0.05
  merge_method: dare_ties
+ base_model: NousResearch/Meta-Llama-3-8B
  dtype: bfloat16
  ```
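The recipe above is a standard mergekit DARE-TIES config. As a rough sketch of reproducing the merge locally, assuming mergekit is installed and the YAML is saved as `config.yaml` (the output path and extra flags are illustrative choices, not taken from this card):

```python
# Sketch: invoke mergekit's config-driven CLI on the YAML shown above.
# Assumes `pip install mergekit`; adjust paths and options to your setup.
import subprocess

subprocess.run(
    [
        "mergekit-yaml",      # mergekit's YAML-driven entry point
        "config.yaml",        # the DARE-TIES recipe above, saved to disk
        "./Daredevil-8B",     # output directory for the merged weights
        "--copy-tokenizer",   # copy the base model's tokenizer into the output
        "--lazy-unpickle",    # reduce peak memory while reading source shards
    ],
    check=True,
)
```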

  pipeline = transformers.pipeline(
      "text-generation",
      model=model,
+     torch_dtype=torch.bfloat16,
      device_map="auto",
  )
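
The hunk above shows only the changed `torch_dtype` line of the usage snippet. For reference, a self-contained sketch of the surrounding chat-template workflow looks roughly like this; the prompt, sampling settings, and variable names are illustrative and not copied from the hidden lines of the card:

```python
# Sketch: standard transformers chat-template generation with Daredevil-8B.
import torch
import transformers
from transformers import AutoTokenizer

model = "mlabonne/Daredevil-8B"
tokenizer = AutoTokenizer.from_pretrained(model)

# Build a chat-formatted prompt using the model's chat template.
messages = [{"role": "user", "content": "What is a large language model?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,  # matches the dtype set in the updated card
    device_map="auto",
)

outputs = pipeline(
    prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95
)
print(outputs[0]["generated_text"])
```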