Upload README.md
README.md
CHANGED
@@ -34,15 +34,33 @@ Github: [Coming soon...]
 
 # **Model Benchmark**
 
-## Open leaderboard
-
-
+## Open leaderboard
+> Follow up as [link](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard).
+
 | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | CommonGen-V2 |
 | --- | --- | --- | --- | --- | --- | --- |
+| Ko-PlatYi-6B-O | NaN | NaN | NaN | NaN | NaN | NaN |
+| Ko-PlatYi-6B-kiwi | NaN | NaN | NaN | NaN | NaN | NaN |
+| Ko-PlatYi-6B-gu | NaN | NaN | NaN | NaN | NaN | NaN |
 | **Ko-PlatYi-6B** | NaN | NaN | NaN | NaN | NaN | NaN |
-
+| Yi-Ko-6B | 48.79 | 41.04 | 53.39 | 46.28 | 41.64 | 61.63 |
+
+---
+## AI-Harness Evaluation
+> AI-Harness evaluation; [link](https://github.com/Beomi/ko-lm-evaluation-harness)
+
+| Model | BoolQ | Copa | HellaSwag | Sentineg |
+| --- | --- | --- | --- | --- |
+| | *Zero-shot* ||||
+| Ko-PlatYi-6B-O | 0.3343 | 0.7687 | 0.4833 | 0.5794 |
+| Ko-PlatYi-6B-kiwi | 0.3343 | 0.7665 | 0.4746 | **0.6248** |
+| Ko-PlatYi-6B-gu | **0.7077** | **0.7696** | 0.4797 | 0.3979 |
+| **Ko-PlatYi-6B** | 0.3343 | 0.7684 | **0.4917** | 0.5226 |
+| Yi-Ko-6B | **0.7070** | 0.7696 | **0.5009** | 0.4044 |
 
+---
 # Implementation Code
+
 ```python
 ### KO-Platypus
 from transformers import AutoModelForCausalLM, AutoTokenizer