leaderboard-pr-bot committed on
Commit 0a64752
1 Parent(s): c16867f

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +110 -1
README.md CHANGED
@@ -1,10 +1,105 @@
  ---
- library_name: transformers
  license: apache-2.0
+ library_name: transformers
  tags:
  - jamba
  - mamba
  - moe
+ model-index:
+ - name: Jamba-v0.1
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 20.26
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ai21labs/Jamba-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 10.72
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ai21labs/Jamba-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 0.98
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ai21labs/Jamba-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 2.46
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ai21labs/Jamba-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 3.71
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ai21labs/Jamba-v0.1
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 16.45
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ai21labs/Jamba-v0.1
+       name: Open LLM Leaderboard
  ---

  # Model Card for Jamba
@@ -165,3 +260,17 @@ As a base model, Jamba is intended for use as a foundation layer for fine tuning
  AI21 builds reliable, practical, and scalable AI solutions for the enterprise.

  Jamba is the first in AI21’s new family of models, and the Instruct version of Jamba is coming soon to the [AI21 platform](https://www.ai21.com/studio).
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ai21labs__Jamba-v0.1)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               | 9.10|
+ |IFEval (0-Shot)    |20.26|
+ |BBH (3-Shot)       |10.72|
+ |MATH Lvl 5 (4-Shot)| 0.98|
+ |GPQA (0-shot)      | 2.46|
+ |MuSR (0-shot)      | 3.71|
+ |MMLU-PRO (5-shot)  |16.45|
+
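
Reviewers who want to sanity-check the `Avg.` row of the results table added by this PR can recompute it from the six benchmark scores. The sketch below assumes that `Avg.` is a plain unweighted mean of those scores; the PR does not state how the leaderboard aggregates, so treat that as an assumption. The `scores` dictionary and `mean_score` helper are illustrative names, not part of the PR or of any leaderboard tooling.

```python
# Scores copied by hand from the results table this PR adds to README.md.
scores = {
    "IFEval (0-Shot)": 20.26,
    "BBH (3-Shot)": 10.72,
    "MATH Lvl 5 (4-Shot)": 0.98,
    "GPQA (0-shot)": 2.46,
    "MuSR (0-shot)": 3.71,
    "MMLU-PRO (5-shot)": 16.45,
}


def mean_score(values):
    """Return the unweighted arithmetic mean (the assumed aggregation)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())

# Format to two decimals, matching the precision used in the table.
print(f"Avg. = {average:.2f}")
```

Under that assumption the recomputed value rounds to the 9.10 shown in the table, which suggests the row is consistent with the per-benchmark values listed under `model-index`.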