leaderboard-pr-bot committed on
Commit
db18571
1 Parent(s): 36f9906

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +106 -0
README.md CHANGED
@@ -109,6 +109,98 @@ model-index:
     source:
       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
       name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 51.31
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 37.25
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 10.2
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 10.18
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 9.53
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 32.39
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Ba2han/Llama-Phi-3_DoRA
+      name: Open LLM Leaderboard
 ---
 
 **We have Llama-3 at home!**
@@ -137,3 +229,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 |Winogrande (5-shot) |73.40|
 |GSM8k (5-shot) |68.01|
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Ba2han__Llama-Phi-3_DoRA)
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |25.14|
+|IFEval (0-Shot)    |51.31|
+|BBH (3-Shot)       |37.25|
+|MATH Lvl 5 (4-Shot)|10.20|
+|GPQA (0-shot)      |10.18|
+|MuSR (0-shot)      | 9.53|
+|MMLU-PRO (5-shot)  |32.39|
+
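
As a quick sanity check on the added table, the sketch below recomputes the `Avg.` row from the six benchmark scores. It assumes the average is a plain unweighted mean rounded to two decimals; the PR does not state how the leaderboard derives it, but the numbers are consistent with that reading. The helper name `leaderboard_average` is hypothetical and only used for illustration.

```python
# Scores copied from the results table added by this PR.
scores = {
    "IFEval (0-Shot)": 51.31,
    "BBH (3-Shot)": 37.25,
    "MATH Lvl 5 (4-Shot)": 10.20,
    "GPQA (0-shot)": 10.18,
    "MuSR (0-shot)": 9.53,
    "MMLU-PRO (5-shot)": 32.39,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean rounded to two decimals (an assumption,
    not a documented leaderboard formula)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")
```

If the printed value ever differs from the `Avg.` row, either a score was transcribed incorrectly or the leaderboard uses a different aggregation than the one assumed here.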