---
license: apache-2.0
datasets:
- teknium/OpenHermes-2.5
tags:
- axolotl
- 01-ai/Yi-1.5-9B-Chat
- finetune
---


# Hermes-2.5-Yi-1.5-9B-Chat

This model is a fine-tuned version of [01-ai/Yi-1.5-9B-Chat](https://huggingface.co./01-ai/Yi-1.5-9B-Chat) on the [teknium/OpenHermes-2.5](https://huggingface.co./datasets/teknium/OpenHermes-2.5) dataset.
I'm very happy with the results. The model now seems noticeably smarter and more "aware" in certain situations (first impressions, so my opinion may change with more usage). It has quite a big edge on the AGIEval benchmark among models in its class.
I plan to extend its context length to 32k with PoSE.

## Model Details

- **Base Model:** 01-ai/Yi-1.5-9B-Chat
- **Chat Template:** ChatML
- **Dataset:** teknium/OpenHermes-2.5
- **Sequence Length:** 8192 tokens
- **Training:**
  - **Epochs:** 1
  - **Hardware:** 4 Nodes x 4 NVIDIA A100 40GB GPUs
  - **Duration:** 48:32:13
  - **Cluster:** KIT SCC Cluster

## Benchmarks (n_shots=0)


![image/png](https://cdn-uploads.huggingface.co/production/uploads/659c4ecb413a1376bee2f661/0wv3AMaoete7ysT005n89.png)

| Benchmark         | Score  |
|-------------------|--------|
| ARC (Challenge)   | 52.47% |
| ARC (Easy)        | 81.65% |
| BoolQ             | 87.22% |
| HellaSwag         | 60.52% |
| OpenBookQA        | 33.60% |
| PIQA              | 81.12% |
| Winogrande        | 72.22% |
| AGIEval           | 38.46% |
| TruthfulQA        | 44.22% |
| MMLU              | 59.72% |
| IFEval            | 47.96% |


For detailed benchmark results, including sub-categories and various metrics, please refer to the [full benchmark table](#full-benchmark-results) at the end of this README.
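
The result tables below follow EleutherAI's lm-evaluation-harness output format. As a minimal sketch of how a subset of the zero-shot numbers could be reproduced (assuming lm-evaluation-harness v0.4.x; the exact harness version and settings used for this card are not recorded here):

```python
# Minimal sketch: reproduce a subset of the zero-shot scores with
# EleutherAI's lm-evaluation-harness (assumes v0.4.x; exact version
# and settings used for this card are an assumption).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=juvi21/Hermes-2.5-Yi-1.5-9B-Chat",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag", "mmlu"],
    num_fewshot=0,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```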

## GGUF and Quantizations

- llama.cpp [b3166](https://github.com/ggerganov/llama.cpp/releases/tag/b3166)
- [juvi21/Hermes-2.5-Yi-1.5-9B-Chat-GGUF](https://huggingface.co./juvi21/Hermes-2.5-Yi-1.5-9B-Chat-GGUF) is available in:
  - **F16**, **Q8_0**, **Q6_K**, **Q5_K_M**, **Q4_K_M**, **Q3_K_M**, **Q2_K**
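
For local inference on the quantized files, a minimal sketch using llama-cpp-python follows (the GGUF file name below is hypothetical; check the GGUF repo for the actual file names):

```python
# Minimal sketch: run a quantized GGUF locally with llama-cpp-python.
# The model_path file name is hypothetical; use the actual file from
# juvi21/Hermes-2.5-Yi-1.5-9B-Chat-GGUF.
from llama_cpp import Llama

llm = Llama(
    model_path="Hermes-2.5-Yi-1.5-9B-Chat.Q4_K_M.gguf",  # hypothetical name
    n_ctx=8192,            # matches the training sequence length
    chat_format="chatml",  # the model's chat template
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the question to 42?"}]
)
print(out["choices"][0]["message"]["content"])
```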



## Usage

To use this model, you can load it using the Hugging Face Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("juvi21/Hermes-2.5-Yi-1.5-9B-Chat")
tokenizer = AutoTokenizer.from_pretrained("juvi21/Hermes-2.5-Yi-1.5-9B-Chat")

# Generate text
input_text = "What is the question to 42?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)  # cap the response length
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

## ChatML Prompt Format
```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
Knock Knock, who is there?<|im_end|>
<|im_start|>assistant
Hi there! <|im_end|>
```
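
Rather than assembling the `<|im_start|>`/`<|im_end|>` turns by hand, you can let the tokenizer render them. A minimal sketch, assuming the tokenizer config ships with the ChatML template noted above (generation settings are illustrative):

```python
# Minimal sketch: build the ChatML prompt via the tokenizer's chat
# template (assumes the tokenizer config includes the ChatML template
# shown above; generation settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "juvi21/Hermes-2.5-Yi-1.5-9B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Knock Knock, who is there?"},
]
# Renders the ChatML turns and appends the assistant header so the
# model continues with its reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```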
## License

This model is released under the Apache 2.0 license.

## Acknowledgements

Special thanks to:
- Teknium for the great OpenHermes-2.5 dataset
- 01-ai for their great model
- KIT SCC for FLOPS

## Citation

If you use this model in your research, consider citing it. In any case, please definitely cite NousResearch and 01-ai:

```bibtex
@misc{hermes25yi159bchat,
  author = {juvi21},
  title = {Hermes-2.5-Yi-1.5-9B-Chat},
  year = {2024},
}
```
## Full Benchmark Results

|                 Tasks                 |Version|Filter|n-shot|        Metric         |   | Value |   |Stderr|
|---------------------------------------|-------|------|-----:|-----------------------|---|------:|---|------|
|agieval                                |N/A    |none  |     0|acc                    |↑  | 0.5381|±  |0.0049|
|                                       |       |none  |     0|acc_norm               |↑  | 0.5715|±  |0.0056|
| - agieval_aqua_rat                    |      1|none  |     0|acc                    |↑  | 0.3858|±  |0.0306|
|                                       |       |none  |     0|acc_norm               |↑  | 0.3425|±  |0.0298|
| - agieval_gaokao_biology              |      1|none  |     0|acc                    |↑  | 0.6048|±  |0.0338|
|                                       |       |none  |     0|acc_norm               |↑  | 0.6000|±  |0.0339|
| - agieval_gaokao_chemistry            |      1|none  |     0|acc                    |↑  | 0.4879|±  |0.0348|
|                                       |       |none  |     0|acc_norm               |↑  | 0.4106|±  |0.0343|
| - agieval_gaokao_chinese              |      1|none  |     0|acc                    |↑  | 0.5935|±  |0.0314|
|                                       |       |none  |     0|acc_norm               |↑  | 0.5813|±  |0.0315|
| - agieval_gaokao_english              |      1|none  |     0|acc                    |↑  | 0.8235|±  |0.0218|
|                                       |       |none  |     0|acc_norm               |↑  | 0.8431|±  |0.0208|
| - agieval_gaokao_geography            |      1|none  |     0|acc                    |↑  | 0.7085|±  |0.0323|
|                                       |       |none  |     0|acc_norm               |↑  | 0.6985|±  |0.0326|
| - agieval_gaokao_history              |      1|none  |     0|acc                    |↑  | 0.7830|±  |0.0269|
|                                       |       |none  |     0|acc_norm               |↑  | 0.7660|±  |0.0277|
| - agieval_gaokao_mathcloze            |      1|none  |     0|acc                    |↑  | 0.0508|±  |0.0203|
| - agieval_gaokao_mathqa               |      1|none  |     0|acc                    |↑  | 0.3761|±  |0.0259|
|                                       |       |none  |     0|acc_norm               |↑  | 0.3590|±  |0.0256|
| - agieval_gaokao_physics              |      1|none  |     0|acc                    |↑  | 0.4950|±  |0.0354|
|                                       |       |none  |     0|acc_norm               |↑  | 0.4700|±  |0.0354|
| - agieval_jec_qa_ca                   |      1|none  |     0|acc                    |↑  | 0.6557|±  |0.0150|
|                                       |       |none  |     0|acc_norm               |↑  | 0.5926|±  |0.0156|
| - agieval_jec_qa_kd                   |      1|none  |     0|acc                    |↑  | 0.7310|±  |0.0140|
|                                       |       |none  |     0|acc_norm               |↑  | 0.6610|±  |0.0150|
| - agieval_logiqa_en                   |      1|none  |     0|acc                    |↑  | 0.5177|±  |0.0196|
|                                       |       |none  |     0|acc_norm               |↑  | 0.4839|±  |0.0196|
| - agieval_logiqa_zh                   |      1|none  |     0|acc                    |↑  | 0.4854|±  |0.0196|
|                                       |       |none  |     0|acc_norm               |↑  | 0.4501|±  |0.0195|
| - agieval_lsat_ar                     |      1|none  |     0|acc                    |↑  | 0.2913|±  |0.0300|
|                                       |       |none  |     0|acc_norm               |↑  | 0.2696|±  |0.0293|
| - agieval_lsat_lr                     |      1|none  |     0|acc                    |↑  | 0.7196|±  |0.0199|
|                                       |       |none  |     0|acc_norm               |↑  | 0.6824|±  |0.0206|
| - agieval_lsat_rc                     |      1|none  |     0|acc                    |↑  | 0.7212|±  |0.0274|
|                                       |       |none  |     0|acc_norm               |↑  | 0.6989|±  |0.0280|
| - agieval_math                        |      1|none  |     0|acc                    |↑  | 0.0910|±  |0.0091|
| - agieval_sat_en                      |      1|none  |     0|acc                    |↑  | 0.8204|±  |0.0268|
|                                       |       |none  |     0|acc_norm               |↑  | 0.8301|±  |0.0262|
| - agieval_sat_en_without_passage      |      1|none  |     0|acc                    |↑  | 0.5194|±  |0.0349|
|                                       |       |none  |     0|acc_norm               |↑  | 0.4806|±  |0.0349|
| - agieval_sat_math                    |      1|none  |     0|acc                    |↑  | 0.5864|±  |0.0333|
|                                       |       |none  |     0|acc_norm               |↑  | 0.5409|±  |0.0337|
|arc_challenge                          |      1|none  |     0|acc                    |↑  | 0.5648|±  |0.0145|
|                                       |       |none  |     0|acc_norm               |↑  | 0.5879|±  |0.0144|
|arc_easy                               |      1|none  |     0|acc                    |↑  | 0.8241|±  |0.0078|
|                                       |       |none  |     0|acc_norm               |↑  | 0.8165|±  |0.0079|
|boolq                                  |      2|none  |     0|acc                    |↑  | 0.8624|±  |0.0060|
|hellaswag                              |      1|none  |     0|acc                    |↑  | 0.5901|±  |0.0049|
|                                       |       |none  |     0|acc_norm               |↑  | 0.7767|±  |0.0042|
|ifeval                                 |      2|none  |     0|inst_level_loose_acc   |↑  | 0.5156|±  |N/A   |
|                                       |       |none  |     0|inst_level_strict_acc  |↑  | 0.4748|±  |N/A   |
|                                       |       |none  |     0|prompt_level_loose_acc |↑  | 0.3863|±  |0.0210|
|                                       |       |none  |     0|prompt_level_strict_acc|↑  | 0.3309|±  |0.0202|
|mmlu                                   |N/A    |none  |     0|acc                    |↑  | 0.6942|±  |0.0037|
|  - abstract_algebra                   |      0|none  |     0|acc                    |↑  | 0.4900|±  |0.0502|
|  - anatomy                            |      0|none  |     0|acc                    |↑  | 0.6815|±  |0.0402|
|  - astronomy                          |      0|none  |     0|acc                    |↑  | 0.7895|±  |0.0332|
|  - business_ethics                    |      0|none  |     0|acc                    |↑  | 0.7600|±  |0.0429|
|  - clinical_knowledge                 |      0|none  |     0|acc                    |↑  | 0.7132|±  |0.0278|
|  - college_biology                    |      0|none  |     0|acc                    |↑  | 0.8056|±  |0.0331|
|  - college_chemistry                  |      0|none  |     0|acc                    |↑  | 0.5300|±  |0.0502|
|  - college_computer_science           |      0|none  |     0|acc                    |↑  | 0.6500|±  |0.0479|
|  - college_mathematics                |      0|none  |     0|acc                    |↑  | 0.4100|±  |0.0494|
|  - college_medicine                   |      0|none  |     0|acc                    |↑  | 0.6763|±  |0.0357|
|  - college_physics                    |      0|none  |     0|acc                    |↑  | 0.5000|±  |0.0498|
|  - computer_security                  |      0|none  |     0|acc                    |↑  | 0.8200|±  |0.0386|
|  - conceptual_physics                 |      0|none  |     0|acc                    |↑  | 0.7489|±  |0.0283|
|  - econometrics                       |      0|none  |     0|acc                    |↑  | 0.5877|±  |0.0463|
|  - electrical_engineering             |      0|none  |     0|acc                    |↑  | 0.6759|±  |0.0390|
|  - elementary_mathematics             |      0|none  |     0|acc                    |↑  | 0.6481|±  |0.0246|
|  - formal_logic                       |      0|none  |     0|acc                    |↑  | 0.5873|±  |0.0440|
|  - global_facts                       |      0|none  |     0|acc                    |↑  | 0.3900|±  |0.0490|
|  - high_school_biology                |      0|none  |     0|acc                    |↑  | 0.8613|±  |0.0197|
|  - high_school_chemistry              |      0|none  |     0|acc                    |↑  | 0.6453|±  |0.0337|
|  - high_school_computer_science       |      0|none  |     0|acc                    |↑  | 0.8300|±  |0.0378|
|  - high_school_european_history       |      0|none  |     0|acc                    |↑  | 0.8182|±  |0.0301|
|  - high_school_geography              |      0|none  |     0|acc                    |↑  | 0.8485|±  |0.0255|
|  - high_school_government_and_politics|      0|none  |     0|acc                    |↑  | 0.8964|±  |0.0220|
|  - high_school_macroeconomics         |      0|none  |     0|acc                    |↑  | 0.7923|±  |0.0206|
|  - high_school_mathematics            |      0|none  |     0|acc                    |↑  | 0.4407|±  |0.0303|
|  - high_school_microeconomics         |      0|none  |     0|acc                    |↑  | 0.8655|±  |0.0222|
|  - high_school_physics                |      0|none  |     0|acc                    |↑  | 0.5298|±  |0.0408|
|  - high_school_psychology             |      0|none  |     0|acc                    |↑  | 0.8679|±  |0.0145|
|  - high_school_statistics             |      0|none  |     0|acc                    |↑  | 0.6898|±  |0.0315|
|  - high_school_us_history             |      0|none  |     0|acc                    |↑  | 0.8873|±  |0.0222|
|  - high_school_world_history          |      0|none  |     0|acc                    |↑  | 0.8312|±  |0.0244|
|  - human_aging                        |      0|none  |     0|acc                    |↑  | 0.7085|±  |0.0305|
|  - human_sexuality                    |      0|none  |     0|acc                    |↑  | 0.7557|±  |0.0377|
| - humanities                          |N/A    |none  |     0|acc                    |↑  | 0.6323|±  |0.0067|
|  - international_law                  |      0|none  |     0|acc                    |↑  | 0.8099|±  |0.0358|
|  - jurisprudence                      |      0|none  |     0|acc                    |↑  | 0.7685|±  |0.0408|
|  - logical_fallacies                  |      0|none  |     0|acc                    |↑  | 0.7975|±  |0.0316|
|  - machine_learning                   |      0|none  |     0|acc                    |↑  | 0.5179|±  |0.0474|
|  - management                         |      0|none  |     0|acc                    |↑  | 0.8835|±  |0.0318|
|  - marketing                          |      0|none  |     0|acc                    |↑  | 0.9017|±  |0.0195|
|  - medical_genetics                   |      0|none  |     0|acc                    |↑  | 0.8000|±  |0.0402|
|  - miscellaneous                      |      0|none  |     0|acc                    |↑  | 0.8225|±  |0.0137|
|  - moral_disputes                     |      0|none  |     0|acc                    |↑  | 0.7283|±  |0.0239|
|  - moral_scenarios                    |      0|none  |     0|acc                    |↑  | 0.4860|±  |0.0167|
|  - nutrition                          |      0|none  |     0|acc                    |↑  | 0.7353|±  |0.0253|
| - other                               |N/A    |none  |     0|acc                    |↑  | 0.7287|±  |0.0077|
|  - philosophy                         |      0|none  |     0|acc                    |↑  | 0.7170|±  |0.0256|
|  - prehistory                         |      0|none  |     0|acc                    |↑  | 0.7346|±  |0.0246|
|  - professional_accounting            |      0|none  |     0|acc                    |↑  | 0.5638|±  |0.0296|
|  - professional_law                   |      0|none  |     0|acc                    |↑  | 0.5163|±  |0.0128|
|  - professional_medicine              |      0|none  |     0|acc                    |↑  | 0.6875|±  |0.0282|
|  - professional_psychology            |      0|none  |     0|acc                    |↑  | 0.7092|±  |0.0184|
|  - public_relations                   |      0|none  |     0|acc                    |↑  | 0.6727|±  |0.0449|
|  - security_studies                   |      0|none  |     0|acc                    |↑  | 0.7347|±  |0.0283|
| - social_sciences                     |N/A    |none  |     0|acc                    |↑  | 0.7910|±  |0.0072|
|  - sociology                          |      0|none  |     0|acc                    |↑  | 0.8060|±  |0.0280|
| - stem                                |N/A    |none  |     0|acc                    |↑  | 0.6581|±  |0.0081|
|  - us_foreign_policy                  |      0|none  |     0|acc                    |↑  | 0.8900|±  |0.0314|
|  - virology                           |      0|none  |     0|acc                    |↑  | 0.5301|±  |0.0389|
|  - world_religions                    |      0|none  |     0|acc                    |↑  | 0.8012|±  |0.0306|
|openbookqa                             |      1|none  |     0|acc                    |↑  | 0.3280|±  |0.0210|
|                                       |       |none  |     0|acc_norm               |↑  | 0.4360|±  |0.0222|
|piqa                                   |      1|none  |     0|acc                    |↑  | 0.7982|±  |0.0094|
|                                       |       |none  |     0|acc_norm               |↑  | 0.8074|±  |0.0092|
|truthfulqa                             |N/A    |none  |     0|acc                    |↑  | 0.4746|±  |0.0116|
|                                       |       |none  |     0|bleu_acc               |↑  | 0.4700|±  |0.0175|
|                                       |       |none  |     0|bleu_diff              |↑  | 0.3214|±  |0.6045|
|                                       |       |none  |     0|bleu_max               |↑  |22.5895|±  |0.7122|
|                                       |       |none  |     0|rouge1_acc             |↑  | 0.4798|±  |0.0175|
|                                       |       |none  |     0|rouge1_diff            |↑  | 0.0846|±  |0.7161|
|                                       |       |none  |     0|rouge1_max             |↑  |48.7180|±  |0.7833|
|                                       |       |none  |     0|rouge2_acc             |↑  | 0.4149|±  |0.0172|
|                                       |       |none  |     0|rouge2_diff            |↑  |-0.4656|±  |0.8375|
|                                       |       |none  |     0|rouge2_max             |↑  |34.0585|±  |0.8974|
|                                       |       |none  |     0|rougeL_acc             |↑  | 0.4651|±  |0.0175|
|                                       |       |none  |     0|rougeL_diff            |↑  |-0.2804|±  |0.7217|
|                                       |       |none  |     0|rougeL_max             |↑  |45.2232|±  |0.7971|
| - truthfulqa_gen                      |      3|none  |     0|bleu_acc               |↑  | 0.4700|±  |0.0175|
|                                       |       |none  |     0|bleu_diff              |↑  | 0.3214|±  |0.6045|
|                                       |       |none  |     0|bleu_max               |↑  |22.5895|±  |0.7122|
|                                       |       |none  |     0|rouge1_acc             |↑  | 0.4798|±  |0.0175|
|                                       |       |none  |     0|rouge1_diff            |↑  | 0.0846|±  |0.7161|
|                                       |       |none  |     0|rouge1_max             |↑  |48.7180|±  |0.7833|
|                                       |       |none  |     0|rouge2_acc             |↑  | 0.4149|±  |0.0172|
|                                       |       |none  |     0|rouge2_diff            |↑  |-0.4656|±  |0.8375|
|                                       |       |none  |     0|rouge2_max             |↑  |34.0585|±  |0.8974|
|                                       |       |none  |     0|rougeL_acc             |↑  | 0.4651|±  |0.0175|
|                                       |       |none  |     0|rougeL_diff            |↑  |-0.2804|±  |0.7217|
|                                       |       |none  |     0|rougeL_max             |↑  |45.2232|±  |0.7971|
| - truthfulqa_mc1                      |      2|none  |     0|acc                    |↑  | 0.3905|±  |0.0171|
| - truthfulqa_mc2                      |      2|none  |     0|acc                    |↑  | 0.5587|±  |0.0156|
|winogrande                             |      1|none  |     0|acc                    |↑  | 0.7388|±  |0.0123|

|      Groups      |Version|Filter|n-shot|  Metric   |   | Value |   |Stderr|
|------------------|-------|------|-----:|-----------|---|------:|---|-----:|
|agieval           |N/A    |none  |     0|acc        |↑  | 0.5381|±  |0.0049|
|                  |       |none  |     0|acc_norm   |↑  | 0.5715|±  |0.0056|
|mmlu              |N/A    |none  |     0|acc        |↑  | 0.6942|±  |0.0037|
| - humanities     |N/A    |none  |     0|acc        |↑  | 0.6323|±  |0.0067|
| - other          |N/A    |none  |     0|acc        |↑  | 0.7287|±  |0.0077|
| - social_sciences|N/A    |none  |     0|acc        |↑  | 0.7910|±  |0.0072|
| - stem           |N/A    |none  |     0|acc        |↑  | 0.6581|±  |0.0081|
|truthfulqa        |N/A    |none  |     0|acc        |↑  | 0.4746|±  |0.0116|
|                  |       |none  |     0|bleu_acc   |↑  | 0.4700|±  |0.0175|
|                  |       |none  |     0|bleu_diff  |↑  | 0.3214|±  |0.6045|
|                  |       |none  |     0|bleu_max   |↑  |22.5895|±  |0.7122|
|                  |       |none  |     0|rouge1_acc |↑  | 0.4798|±  |0.0175|
|                  |       |none  |     0|rouge1_diff|↑  | 0.0846|±  |0.7161|
|                  |       |none  |     0|rouge1_max |↑  |48.7180|±  |0.7833|
|                  |       |none  |     0|rouge2_acc |↑  | 0.4149|±  |0.0172|
|                  |       |none  |     0|rouge2_diff|↑  |-0.4656|±  |0.8375|
|                  |       |none  |     0|rouge2_max |↑  |34.0585|±  |0.8974|
|                  |       |none  |     0|rougeL_acc |↑  | 0.4651|±  |0.0175|
|                  |       |none  |     0|rougeL_diff|↑  |-0.2804|±  |0.7217|
|                  |       |none  |     0|rougeL_max |↑  |45.2232|±  |0.7971|