Files changed (1)
  1. README.md +110 -2
README.md CHANGED
@@ -1,7 +1,102 @@
  ---
  license: apache-2.0
- pipeline_tag: text-generation
  library_name: transformers
  ---
  TrainOutput(global_step=180, training_loss=0.06880439353962547, metrics={'train_runtime': 1644.0348, 'train_samples_per_second': 1.749, 'train_steps_per_second': 0.109, 'total_flos': 0.0, 'train_loss': 0.06880439353962547, 'epoch': 0.5006954102920723})
  ---
@@ -26,4 +121,17 @@ tags:
 
  This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
  license: apache-2.0
  library_name: transformers
+ pipeline_tag: text-generation
+ model-index:
+ - name: zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 40.87
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sonthenguyen/zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 21.35
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sonthenguyen/zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 1.81
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sonthenguyen/zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 3.47
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sonthenguyen/zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 6.16
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sonthenguyen/zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 19.42
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sonthenguyen/zephyr-sft-bnb-4bit-DPO-mtbo-180steps
+       name: Open LLM Leaderboard
  ---
  TrainOutput(global_step=180, training_loss=0.06880439353962547, metrics={'train_runtime': 1644.0348, 'train_samples_per_second': 1.749, 'train_steps_per_second': 0.109, 'total_flos': 0.0, 'train_loss': 0.06880439353962547, 'epoch': 0.5006954102920723})
  ---
 
 
  This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_sonthenguyen__zephyr-sft-bnb-4bit-DPO-mtbo-180steps)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |15.51|
+ |IFEval (0-Shot)    |40.87|
+ |BBH (3-Shot)       |21.35|
+ |MATH Lvl 5 (4-Shot)| 1.81|
+ |GPQA (0-shot)      | 3.47|
+ |MuSR (0-shot)      | 6.16|
+ |MMLU-PRO (5-shot)  |19.42|
+
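
As a quick sanity check on the results table added in this change, the reported `Avg.` can be recomputed from the six benchmark scores. This is a minimal sketch, assuming `Avg.` is the unweighted mean of the six values rounded to two decimals; the `scores` and `average` names are illustrative and not part of the PR.

```python
# Benchmark scores copied from the results table in this diff.
scores = {
    "IFEval (0-Shot)": 40.87,
    "BBH (3-Shot)": 21.35,
    "MATH Lvl 5 (4-Shot)": 1.81,
    "GPQA (0-shot)": 3.47,
    "MuSR (0-shot)": 6.16,
    "MMLU-PRO (5-shot)": 19.42,
}

# Assumption: "Avg." is the plain arithmetic mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)

print(f"Avg. = {average:.2f}")  # prints "Avg. = 15.51"
```

Under that assumption the recomputed mean agrees with the `Avg.` row, so the table and the `model-index` values are mutually consistent.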