Adding Evaluation Results (#11), opened by leaderboard-pr-bot
README.md CHANGED
@@ -367,3 +367,17 @@ The model is licensed under the LLaMA model, and the dataset is licensed under t
 ## Author
 
 [JUNE LEE](https://github.com/melodysdreamj) - He is active in Songdo Artificial Intelligence Study and GDG Songdo.
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TheBloke__wizard-vicuna-13B-GPTQ)
+
+| Metric                | Value                     |
+|-----------------------|---------------------------|
+| Avg.                  | 32.08                     |
+| ARC (25-shot)         | 28.67                     |
+| HellaSwag (10-shot)   | 25.94                     |
+| MMLU (5-shot)         | 25.84                     |
+| TruthfulQA (0-shot)   | 48.53                     |
+| Winogrande (5-shot)   | 74.74                     |
+| GSM8K (5-shot)        | 9.63                      |
+| DROP (3-shot)         | 11.21                     |
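
The `Avg.` row in the added table appears to be the unweighted mean of the seven benchmark scores. The PR does not say how it was computed, so this is an assumption. The sketch below checks it in Python; the `scores` dictionary and the `leaderboard_average` helper are hypothetical names introduced for illustration, and the values are copied from the table.

```python
# Benchmark scores copied from the table added in this PR.
scores = {
    "ARC (25-shot)": 28.67,
    "HellaSwag (10-shot)": 25.94,
    "MMLU (5-shot)": 25.84,
    "TruthfulQA (0-shot)": 48.53,
    "Winogrande (5-shot)": 74.74,
    "GSM8K (5-shot)": 9.63,
    "DROP (3-shot)": 11.21,
}


def leaderboard_average(benchmark_scores):
    """Return the unweighted arithmetic mean of the scores (hypothetical helper).

    Assumption: every benchmark carries equal weight in the reported average.
    """
    return sum(benchmark_scores.values()) / len(benchmark_scores)


average = leaderboard_average(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 32.08"
```

The seven scores sum to 224.56, and 224.56 / 7 = 32.08, which matches the reported `Avg.` value.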