weezywitasneezy committed: Update README.md
Commit 6242373 · Parent(s): 3504971

README.md CHANGED
This commit adds a one-sentence description of the merge and moves the Open LLM Leaderboard results table from the end of the README to just before the `## 🧩 Configuration` section:

````diff
@@ -116,10 +116,26 @@ model-index:
 
 # BenchmarkEngineering-F2-7B-slerp
 
+This merge seeks to further improve on the original BenchmarkEngineering by integrating the WestLake-7B-v2 model. It boosts the Winogrande score, but at the cost of the other benchmarks.
+
 BenchmarkEngineering-F2-7B-slerp is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [weezywitasneezy/BenchmarkEngineering-7B-slerp](https://huggingface.co/weezywitasneezy/BenchmarkEngineering-7B-slerp)
 * [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_weezywitasneezy__BenchmarkEngineering-F2-7B-slerp)
+
+| Metric                           |Value|
+|---------------------------------|----:|
+|Avg.                             |75.77|
+|AI2 Reasoning Challenge (25-Shot)|73.46|
+|HellaSwag (10-Shot)              |88.88|
+|MMLU (5-Shot)                    |64.50|
+|TruthfulQA (0-shot)              |72.37|
+|Winogrande (5-shot)              |86.11|
+|GSM8k (5-shot)                   |69.29|
+
 ## 🧩 Configuration
 
 ```yaml
@@ -165,16 +181,5 @@ pipeline = transformers.pipeline(
 outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
 print(outputs[0]["generated_text"])
 ```
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_weezywitasneezy__BenchmarkEngineering-F2-7B-slerp)
 
-| Metric                           |Value|
-|---------------------------------|----:|
-|Avg.                             |75.77|
-|AI2 Reasoning Challenge (25-Shot)|73.46|
-|HellaSwag (10-Shot)              |88.88|
-|MMLU (5-Shot)                    |64.50|
-|TruthfulQA (0-shot)              |72.37|
-|Winogrande (5-shot)              |86.11|
-|GSM8k (5-shot)                   |69.29|
 
````
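The body of the `## 🧩 Configuration` YAML block lies outside the hunk context above, so the committed merge settings are not visible in this diff. For orientation only, here is a minimal sketch of the shape a LazyMergekit slerp config for these two models typically takes; every `layer_range` and interpolation value below is a hypothetical placeholder, not the committed configuration:

```yaml
# Hypothetical mergekit slerp config, NOT the committed one: the layer ranges
# and the interpolation schedule t are placeholder values for illustration.
slices:
  - sources:
      - model: weezywitasneezy/BenchmarkEngineering-7B-slerp
        layer_range: [0, 32]
      - model: senseable/WestLake-7B-v2
        layer_range: [0, 32]
merge_method: slerp
base_model: weezywitasneezy/BenchmarkEngineering-7B-slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # per-layer-group weights for attention tensors
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # inverse schedule for MLP tensors
    - value: 0.5                     # default: halfway between the two models
dtype: bfloat16
```

In slerp merges of this kind, `t` controls how far each tensor is interpolated from the base model (`0`) toward the other model (`1`), which is why separate schedules for attention and MLP tensors are a common choice.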
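Likewise, the second hunk shows only the tail of the README's inference example (`outputs = pipeline(...)` and the `print`), with the hunk header indicating a `transformers.pipeline(` call just above it. Below is a self-contained sketch of the standard LazyMergekit usage pattern those lines come from; the model id is taken from the card, but the exact committed cell may differ:

```python
# Sketch of the standard LazyMergekit inference example; assumes the merged
# model is published as weezywitasneezy/BenchmarkEngineering-F2-7B-slerp.
import torch
import transformers
from transformers import AutoTokenizer

model = "weezywitasneezy/BenchmarkEngineering-F2-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build a chat-formatted prompt from the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Text-generation pipeline in half precision with automatic device placement.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# These two lines are the ones visible in the diff above.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```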