Update README.md
README.md CHANGED
@@ -195,7 +195,8 @@ Average Score Comparison between OpenHermes-1 Llama-2 13B and OpenHermes-2 Mistral 7B:
 
 **HumanEval:**
 On code tasks, I first set out to make a Hermes-2 coder, but found that code training can also yield generalist improvements to the model, so I settled for slightly less code capability in exchange for maximum generalist capability. That said, code capabilities still had a decent jump alongside the overall capabilities of the model:
-HumanEval
+Glaive performed HumanEval testing on Hermes-2.5 and found a score of:
+**HumanEval: 50.7% @ Pass1**
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/IeeZnGmEyK73ejq0fKEms.png)
 
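As a side note on the metric: Pass@1 is the standard HumanEval measure, i.e. the fraction of the benchmark's problems solved by a sampled completion. Below is a minimal sketch of the unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021), which reduces to the passing fraction when k=1; the function name `pass_at_k` and the per-problem counts in the usage lines are illustrative assumptions, not Glaive's actual harness or results.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021).

    n: total completions sampled for one problem
    c: completions that passed the problem's unit tests
    k: the k in pass@k
    """
    if n - c < k:
        return 1.0  # every size-k subset must contain a passing sample
    # Numerically stable form: 1 - C(n-c, k) / C(n, k)
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Illustrative only: with one sample per problem, pass@1 is simply the
# fraction of HumanEval's 164 problems whose completion passes.
per_problem = [pass_at_k(n=1, c=1, k=1)] * 83 + [pass_at_k(n=1, c=0, k=1)] * 81
print(f"pass@1 = {np.mean(per_problem):.1%}")  # ~50.6% with these hypothetical counts
```

The estimator matters when sampling more than one completion per problem; at a single sample per problem, the reported score is just the mean pass rate across the benchmark.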