Taishi-N324 committed on
Commit: e9efbf2
Parent: 8abaccb

Update README.md

Files changed (1)
  1. README.md +9 -14
README.md CHANGED
@@ -41,12 +41,10 @@ This repository provides large language models developed by [Swallow-LLM](https:
 
 ### Japanese tasks
 
- |Model|Size|JCom.|JEMHopQA|NIILC|JSQuAD|XL-Sum|MGSM|WMT20-en-ja|WMT20-ja-en|JMMLU|JHumanEval|Ja Avg|
- |---|---|---|---|---|---|---|---|---|---|---|---|
- | | |4-shot|4-shot|4-shot|4-shot|1-shot|4-shot|4-shot|4-shot|5-shot|0-shot| |
+ |Model|JCom.|JEMHopQA|NIILC|JSQuAD|XL-Sum|MGSM|WMT20-en-ja|WMT20-ja-en|JMMLU|JHumanEval|Ja Avg|
+ |---|---|---|---|---|---|---|---|---|---|---|---|
+ | |4-shot|4-shot|4-shot|4-shot|1-shot|4-shot|4-shot|4-shot|5-shot|0-shot| |
  | | |EM acc|Char-F1|Char-F1|Char-F1|ROUGE-2|EM acc|BLEU|BLEU|EM acc|pass@1| |
- | Model | JCom. | JEMHopQA | NIILC | JSQuAD | XL-Sum | MGSM | WMT20-en-ja | WMT20-ja-en | JMMLU | JHumanEval | Ja Avg |
- |----------|-------|----------|-------|--------|--------|------|-------------|-------------|-------|------------|--------|
  | Gemma 2 27B IT | 0.9562 | 0.5413 | 0.5755 | 0.8832 | 0.1648 | 0.7000 | 0.2900 | 0.2500 | 0.6701 | 0.6293 | 0.5660 |
  | Phi-3.5-MoE Instruct | 0.9321 | 0.4416 | 0.4920 | 0.9079 | 0.2255 | 0.7120 | 0.2575 | 0.2024 | 0.6447 | 0.4213 | 0.5237 |
  | GRIN-MoE | 0.8606 | 0.4622 | 0.3943 | 0.8877 | 0.0302 | 0.6400 | 0.2300 | 0.1911 | 0.5696 | 0.4476 | 0.4713 |
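
For reference, the trailing Ja Avg column is consistent with an unweighted mean of the ten task scores (the Gemma 2 27B IT row averages to exactly 0.5660; the other rows check out the same way). A minimal sketch of that presumed aggregation, with values copied from the row above; the dictionary layout is mine, not code from this repository:

```python
# Presumed derivation of Ja Avg: the unweighted mean of the ten Japanese
# task scores. Values below are the Gemma 2 27B IT row of the table above.
ja_scores = {
    "JCom.": 0.9562, "JEMHopQA": 0.5413, "NIILC": 0.5755, "JSQuAD": 0.8832,
    "XL-Sum": 0.1648, "MGSM": 0.7000, "WMT20-en-ja": 0.2900,
    "WMT20-ja-en": 0.2500, "JMMLU": 0.6701, "JHumanEval": 0.6293,
}
ja_avg = sum(ja_scores.values()) / len(ja_scores)
print(f"Ja Avg = {ja_avg:.4f}")  # Ja Avg = 0.5660, matching the table
```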
@@ -65,12 +63,10 @@ This repository provides large language models developed by [Swallow-LLM](https:
 
 ### English tasks
 
- |Model|Size|OpenBookQA|TriviaQA|HellaSWAG|SQuAD2.0|XWINO|MMLU|GSM8K|BBH|HumanEval|EnAvg|
- |---|---|---|---|---|---|---|---|---|---|---|---|
- |||4-shot|4-shot|4-shot|4-shot|4-shot|5-shot|4-shot|3-shot|0-shot||
- |||Acc|EMacc|Acc|EMacc|Acc|Acc|EMacc|CoTEMAcc|pass@1||
- | Model | OpenBookQA | TriviaQA | HellaSwag | SQuAD2.0 | XWIN | MMLU | GSM8K | BBH | HumanEval | En Avg |
- |----------|------------|----------|------------|----------|------|------|-------|-----|-----------|--------|
+ |Model|OpenBookQA|TriviaQA|HellaSWAG|SQuAD2.0|XWINO|MMLU|GSM8K|BBH|HumanEval|En Avg|
+ |---|---|---|---|---|---|---|---|---|---|---|
+ ||4-shot|4-shot|4-shot|4-shot|4-shot|5-shot|4-shot|3-shot|0-shot||
+ ||Acc|EM acc|Acc|EM acc|Acc|Acc|EM acc|CoT EM acc|pass@1||
  | Gemma 2 27B IT | 0.4560 | 0.7660 | 0.6548 | 0.4012 | 0.9101 | 0.7624 | 0.8438 | 0.7876 | 0.6939 | 0.6973 |
  | Phi-3.5-MoE Instruct | 0.4960 | 0.6746 | 0.6901 | 0.3174 | 0.8903 | 0.7872 | 0.8317 | 0.7618 | 0.5561 | 0.6673 |
  | GRIN-MoE | 0.4660 | 0.7035 | 0.7046 | 0.3544 | 0.8976 | 0.7693 | 0.8287 | 0.7533 | 0.6841 | 0.6846 |
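
Both benchmark tables report pass@1 (0-shot) for the HumanEval-style code-generation tasks. For context, the standard unbiased pass@k estimator from Chen et al. (2021) is sketched below; the sample counts in the usage line are illustrative assumptions, not values taken from this evaluation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): the probability that
    at least one of k samples, drawn from n generations of which c pass the
    unit tests, is correct. For k == 1 this reduces to c / n."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers only: 200 generations for one problem, 139 passing.
print(round(pass_at_k(200, 139, 1), 4))  # 0.695
```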
@@ -89,10 +85,9 @@ This repository provides large language models developed by [Swallow-LLM](https:
 
 ## MT-Bench JA
 
- |Model|Size|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMTAvg|
- |---|---|---|---|---|---|---|---|---|---|---|
+ |Model|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMT Avg|
+ |---|---|---|---|---|---|---|---|---|---|
  | Model | coding | extraction | humanities | math | reasoning | roleplay | stem | writing | JMT Avg |
- |-------|--------|------------|------------|------|-----------|----------|------|---------|---------|
  | Gemma 2 27B IT | 0.5467 | 0.6752 | 0.8386 | 0.6246 | 0.7201 | 0.7916 | 0.6787 | 0.807 | 0.7103 |
  | Phi-3.5-MoE Instruct | 0.5214 | 0.8106 | 0.647 | 0.4415 | 0.536 | 0.6712 | 0.5314 | 0.7304 | 0.6112 |
  | GRIN-MoE | 0.5294 | 0.7224 | 0.5923 | 0.5467 | 0.499 | 0.603 | 0.538 | 0.6839 | 0.5893 |
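
The JMT Avg column likewise matches the unweighted mean of the eight MT-Bench categories for every row. A small verification sketch (mine, not part of the repository), with the three rows pasted from the table above:

```python
# Verify that JMT Avg equals the mean of the eight category scores.
rows = """\
| Gemma 2 27B IT | 0.5467 | 0.6752 | 0.8386 | 0.6246 | 0.7201 | 0.7916 | 0.6787 | 0.807 | 0.7103 |
| Phi-3.5-MoE Instruct | 0.5214 | 0.8106 | 0.647 | 0.4415 | 0.536 | 0.6712 | 0.5314 | 0.7304 | 0.6112 |
| GRIN-MoE | 0.5294 | 0.7224 | 0.5923 | 0.5467 | 0.499 | 0.603 | 0.538 | 0.6839 | 0.5893 |""".splitlines()

for row in rows:
    cells = [c.strip() for c in row.strip("|").split("|")]
    model, scores, reported = cells[0], [float(c) for c in cells[1:-1]], float(cells[-1])
    assert round(sum(scores) / len(scores), 4) == reported
    print(f"{model}: mean of 8 categories = {sum(scores) / len(scores):.4f}")
```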
 