fedric95 committed on
Commit
4127b2b
·
verified ·
1 Parent(s): 5c724ab

Upload ./Qwen2-7B-Q3_K_M.mmlu.pro.txt with huggingface_hub

Files changed (1)
  1. Qwen2-7B-Q3_K_M.mmlu.pro.txt +2 -80
Qwen2-7B-Q3_K_M.mmlu.pro.txt CHANGED
@@ -1,80 +1,2 @@
- multiple_choice_score: there are 70 tasks in prompt
- multiple_choice_score: reading tasks......................................................................done
- multiple_choice_score: preparing task data......................................................................done
- multiple_choice_score : calculating TruthfulQA score over 70 tasks.
-
- task acc_norm
- 1 0.00000000
- 2 50.00000000
- 3 33.33333333
- 4 25.00000000
- 5 40.00000000
- 6 33.33333333
- 7 28.57142857
- 8 25.00000000
- 9 22.22222222
- 10 30.00000000
- 11 36.36363636
- 12 41.66666667
- 13 38.46153846
- 14 42.85714286
- 15 46.66666667
- 16 43.75000000
- 17 41.17647059
- 18 38.88888889
- 19 36.84210526
- 20 35.00000000
- 21 33.33333333
- 22 31.81818182
- 23 30.43478261
- 24 29.16666667
- 25 28.00000000
- 26 26.92307692
- 27 25.92592593
- 28 25.00000000
- 29 24.13793103
- 30 23.33333333
- 31 22.58064516
- 32 21.87500000
- 33 24.24242424
- 34 23.52941176
- 35 22.85714286
- 36 22.22222222
- 37 21.62162162
- 38 21.05263158
- 39 20.51282051
- 40 22.50000000
- 41 21.95121951
- 42 21.42857143
- 43 20.93023256
- 44 20.45454545
- 45 20.00000000
- 46 21.73913043
- 47 21.27659574
- 48 22.91666667
- 49 22.44897959
- 50 22.00000000
- 51 21.56862745
- 52 21.15384615
- 53 22.64150943
- 54 22.22222222
- 55 23.63636364
- 56 23.21428571
- 57 22.80701754
- 58 22.41379310
- 59 22.03389831
- 60 23.33333333
- 61 22.95081967
- 62 22.58064516
- 63 22.22222222
- 64 23.43750000
- 65 23.07692308
- 66 22.72727273
- 67 22.38805970
- 68 22.05882353
- 69 21.73913043
- 70 21.42857143
-
- Final result: 21.4286 +/- 4.9397
- Random chance: 10.0000 +/- 3.6116
-

+ multiple_choice_score: there are 12032 tasks in prompt
+ multiple_choice_score: reading tasksmultiple_choice_score: failed to read task 1 of 12032
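
The removed log ends with "Final result: 21.4286 +/- 4.9397" and "Random chance: 10.0000 +/- 3.6116" over 70 tasks. As a rough check, the sketch below reproduces both figures with a binomial standard error that divides by n - 1. That formula, the helper name `binomial_stderr_pct`, and the count of 15 correct answers (inferred from 21.4286% of 70) are assumptions drawn from the numbers themselves, not something the log states.

```python
import math


def binomial_stderr_pct(p: float, n: int) -> float:
    """Standard error of a proportion p over n tasks, in percent.

    Assumption: the n - 1 denominator is chosen because it matches the
    figures in the removed log; the log does not state the formula.
    """
    return 100.0 * math.sqrt(p * (1.0 - p) / (n - 1))


n_tasks = 70       # "there are 70 tasks in prompt"
n_correct = 15     # assumption: inferred from 21.4286% of 70 tasks
chance = 0.10      # "Random chance: 10.0000" from the log

accuracy = n_correct / n_tasks

print(f"Final result: {100 * accuracy:.4f} +/- {binomial_stderr_pct(accuracy, n_tasks):.4f}")
print(f"Random chance: {100 * chance:.4f} +/- {binomial_stderr_pct(chance, n_tasks):.4f}")
```

With only 70 tasks the interval is several points wide on either side, so the removed result is a coarse estimate; the replacement log reports 12032 tasks but fails while reading the first one, so it contains no score to compare against.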