Taishi-N324 committed
Commit e9efbf2 · 1 parent: 8abaccb
Update README.md

README.md CHANGED
```diff
@@ -41,12 +41,10 @@ This repository provides large language models developed by [Swallow-LLM](https:
 
 ### Japanese tasks
 
-|Model|
-|
-| |
+|Model|JCom.|JEMHopQA|NIILC|JSQuAD|XL-Sum|MGSM|WMT20-en-ja|WMT20-ja-en|JMMLU|JHumanEval|Ja Avg|
+|---|---|---|---|---|---|---|---|---|---|---|---|
+| |4-shot|4-shot|4-shot|4-shot|1-shot|4-shot|4-shot|4-shot|5-shot|0-shot| |
 | | |EM acc|Char-F1|Char-F1|Char-F1|ROUGE-2|EM acc|BLEU|BLEU|EM acc|pass@1| |
-| モデル名 | JCom. | JEMHopQA | NIILC | JSQuAD | XL-Sum | MGSM | WMT20-en-ja | WMT20-ja-en | JMMLU | JHumanEval | Ja Avg |
-|----------|-------|----------|-------|--------|--------|------|-------------|-------------|-------|------------|--------|
 | Gemma 2 27B IT | 0.9562 | 0.5413 | 0.5755 | 0.8832 | 0.1648 | 0.7000 | 0.2900 | 0.2500 | 0.6701 | 0.6293 | 0.5660 |
 | Phi-3.5-MoE Instruct | 0.9321 | 0.4416 | 0.4920 | 0.9079 | 0.2255 | 0.7120 | 0.2575 | 0.2024 | 0.6447 | 0.4213 | 0.5237 |
 | GRIN-MoE | 0.8606 | 0.4622 | 0.3943 | 0.8877 | 0.0302 | 0.6400 | 0.2300 | 0.1911 | 0.5696 | 0.4476 | 0.4713 |
@@ -65,12 +63,10 @@ This repository provides large language models developed by [Swallow-LLM](https:
 
 ### English tasks
 
-|Model|
-|
-|
-|
-| モデル名 | OpenBookQA | TriviaQA | HellaSwag | SQuAD2.0 | XWIN | MMLU | GSM8K | BBH | HumanEval | En Avg |
-|----------|------------|----------|------------|----------|------|------|-------|-----|-----------|--------|
+|Model|OpenBookQA|TriviaQA|HellaSWAG|SQuAD2.0|XWINO|MMLU|GSM8K|BBH|HumanEval|EnAvg|
+|---|---|---|---|---|---|---|---|---|---|---|
+||4-shot|4-shot|4-shot|4-shot|4-shot|5-shot|4-shot|3-shot|0-shot||
+||Acc|EMacc|Acc|EMacc|Acc|Acc|EMacc|CoTEMAcc|pass@1||
 | Gemma 2 27B IT | 0.4560 | 0.7660 | 0.6548 | 0.4012 | 0.9101 | 0.7624 | 0.8438 | 0.7876 | 0.6939 | 0.6973 |
 | Phi-3.5-MoE Instruct | 0.4960 | 0.6746 | 0.6901 | 0.3174 | 0.8903 | 0.7872 | 0.8317 | 0.7618 | 0.5561 | 0.6673 |
 | GRIN-MoE | 0.4660 | 0.7035 | 0.7046 | 0.3544 | 0.8976 | 0.7693 | 0.8287 | 0.7533 | 0.6841 | 0.6846 |
@@ -89,10 +85,9 @@ This repository provides large language models developed by [Swallow-LLM](https:
 
 ## MT-Bench JA
 
-|Model|
-|
+|Model|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMTAvg|
+|---|---|---|---|---|---|---|---|---|---|
 | Model | coding | extraction | humanities | math | reasoning | roleplay | stem | writing | JMT Avg |
-|-------|--------|------------|------------|------|-----------|----------|------|---------|---------|
 | Gemma 2 27B IT | 0.5467 | 0.6752 | 0.8386 | 0.6246 | 0.7201 | 0.7916 | 0.6787 | 0.807 | 0.7103 |
 | Phi-3.5-MoE Instruct | 0.5214 | 0.8106 | 0.647 | 0.4415 | 0.536 | 0.6712 | 0.5314 | 0.7304 | 0.6112 |
 | GRIN-MoE | 0.5294 | 0.7224 | 0.5923 | 0.5467 | 0.499 | 0.603 | 0.538 | 0.6839 | 0.5893 |
```
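The Ja Avg, En Avg, and JMT Avg columns appear to be unweighted means of the per-task scores in each row. A minimal sketch that checks this for the Gemma 2 27B IT rows, assuming a plain arithmetic mean rounded to the tables' four decimal places:

```python
# Sanity-check the reported averages against the per-task scores
# (assumption: each Avg column is the unweighted mean of its row).

def row_avg(scores):
    """Arithmetic mean of the per-task scores, rounded to 4 decimals."""
    return round(sum(scores) / len(scores), 4)

# Gemma 2 27B IT, Japanese tasks (JCom. ... JHumanEval); reported Ja Avg: 0.5660
ja = [0.9562, 0.5413, 0.5755, 0.8832, 0.1648, 0.7000, 0.2900, 0.2500, 0.6701, 0.6293]
print(row_avg(ja))   # 0.566

# Gemma 2 27B IT, English tasks (OpenBookQA ... HumanEval); reported En Avg: 0.6973
en = [0.4560, 0.7660, 0.6548, 0.4012, 0.9101, 0.7624, 0.8438, 0.7876, 0.6939]
print(row_avg(en))   # 0.6973

# Gemma 2 27B IT, MT-Bench JA (coding ... writing); reported JMT Avg: 0.7103
jmt = [0.5467, 0.6752, 0.8386, 0.6246, 0.7201, 0.7916, 0.6787, 0.807]
print(row_avg(jmt))  # 0.7103
```

All three recomputed means match the reported averages, so the Avg columns are consistent with a simple per-row mean.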