Llama-Guard-3-8B-GGUF / scores /Llama-Guard-3-8B-Q4_K_M-naive.mmlu
Generate Perplexity, KLD, ARC, HellaSwag, MMLU, Truthful QA and WinoGrande scores
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.
task acc_norm
1 100.00000000
2 50.00000000
3 33.33333333
4 50.00000000
5 40.00000000
6 33.33333333
7 42.85714286
8 50.00000000
9 55.55555556
10 60.00000000
11 54.54545455
12 58.33333333
13 53.84615385
14 50.00000000
15 53.33333333
16 56.25000000
17 58.82352941
18 55.55555556
19 57.89473684
20 60.00000000
21 61.90476190
22 59.09090909
23 56.52173913
24 58.33333333
25 60.00000000
26 57.69230769
27 59.25925926
28 57.14285714
29 58.62068966
30 60.00000000
31 61.29032258
32 62.50000000
33 60.60606061
34 61.76470588
35 62.85714286
36 61.11111111
37 59.45945946
38 60.52631579
39 58.97435897
40 57.50000000
41 56.09756098
42 54.76190476
43 55.81395349
44 54.54545455
45 55.55555556
46 54.34782609
47 53.19148936
48 54.16666667
49 55.10204082
50 54.00000000
51 52.94117647
52 51.92307692
53 50.94339623
54 50.00000000
55 50.90909091
56 51.78571429
57 50.87719298
58 50.00000000
59 49.15254237
60 48.33333333
61 49.18032787
62 48.38709677
63 47.61904762
64 46.87500000
65 46.15384615
66 46.96969697
67 46.26865672
68 45.58823529
69 44.92753623
70 45.71428571
71 46.47887324
72 45.83333333
73 45.20547945
74 45.94594595
75 46.66666667
76 47.36842105
77 48.05194805
78 48.71794872
79 48.10126582
80 48.75000000
81 49.38271605
82 48.78048780
83 49.39759036
84 50.00000000
85 49.41176471
86 48.83720930
87 49.42528736
88 50.00000000
89 49.43820225
90 48.88888889
91 48.35164835
92 47.82608696
93 47.31182796
94 47.87234043
95 47.36842105
96 47.91666667
97 47.42268041
98 46.93877551
99 46.46464646
100 46.00000000
101 46.53465347
102 46.07843137
103 45.63106796
104 46.15384615
105 46.66666667
106 46.22641509
107 45.79439252
108 46.29629630
109 46.78899083
110 47.27272727
111 47.74774775
112 47.32142857
113 46.90265487
114 47.36842105
115 47.82608696
116 47.41379310
117 47.86324786
118 48.30508475
119 47.89915966
120 47.50000000
121 47.93388430
122 47.54098361
123 47.15447154
124 46.77419355
125 47.20000000
126 46.82539683
127 46.45669291
128 46.09375000
129 45.73643411
130 45.38461538
131 45.03816794
132 44.69696970
133 44.36090226
134 44.02985075
135 43.70370370
136 43.38235294
137 43.79562044
138 43.47826087
139 43.88489209
140 43.57142857
141 43.26241135
142 43.66197183
143 44.05594406
144 44.44444444
145 44.82758621
146 45.20547945
147 44.89795918
148 45.27027027
149 44.96644295
150 44.66666667
151 44.37086093
152 44.73684211
153 45.09803922
154 44.80519481
155 44.51612903
156 44.23076923
157 44.58598726
158 44.93670886
159 44.65408805
160 45.00000000
161 44.72049689
162 44.44444444
163 44.17177914
164 43.90243902
165 44.24242424
166 43.97590361
167 44.31137725
168 44.04761905
169 44.37869822
170 44.11764706
171 44.44444444
172 44.18604651
173 43.93063584
174 43.67816092
175 43.42857143
176 43.18181818
177 42.93785311
178 42.69662921
179 43.01675978
180 43.33333333
181 43.64640884
182 43.40659341
183 43.71584699
184 43.47826087
185 43.78378378
186 44.08602151
187 43.85026738
188 44.14893617
189 44.44444444
190 44.21052632
191 44.50261780
192 44.27083333
193 44.04145078
194 44.32989691
195 44.61538462
196 44.89795918
197 44.67005076
198 44.44444444
199 44.22110553
200 44.00000000
201 43.78109453
202 43.56435644
203 43.34975369
204 43.13725490
205 43.41463415
206 43.20388350
207 43.47826087
208 43.26923077
209 43.06220096
210 42.85714286
211 43.12796209
212 42.92452830
213 42.72300469
214 42.52336449
215 42.79069767
216 43.05555556
217 42.85714286
218 43.11926606
219 43.37899543
220 43.63636364
221 43.43891403
222 43.69369369
223 43.94618834
224 43.75000000
225 44.00000000
226 43.80530973
227 43.61233480
228 43.85964912
229 43.66812227
230 43.47826087
231 43.29004329
232 43.53448276
233 43.34763948
234 43.58974359
235 43.40425532
236 43.22033898
237 43.03797468
238 42.85714286
239 42.67782427
240 42.91666667
241 42.73858921
242 42.97520661
243 43.20987654
244 43.03278689
245 43.26530612
246 43.08943089
247 42.91497976
248 43.14516129
249 42.97188755
250 42.80000000
251 42.62948207
252 42.46031746
253 42.29249012
254 42.12598425
255 41.96078431
256 42.18750000
257 42.41245136
258 42.24806202
259 42.08494208
260 42.30769231
261 42.14559387
262 41.98473282
263 41.82509506
264 41.66666667
265 41.88679245
266 42.10526316
267 41.94756554
268 41.79104478
269 42.00743494
270 41.85185185
271 41.69741697
272 41.91176471
273 41.75824176
274 41.60583942
275 41.45454545
276 41.66666667
277 41.87725632
278 41.72661871
279 41.57706093
280 41.78571429
281 41.63701068
282 41.84397163
283 41.69611307
284 41.54929577
285 41.40350877
286 41.25874126
287 41.46341463
288 41.31944444
289 41.17647059
290 41.37931034
291 41.58075601
292 41.78082192
293 41.63822526
294 41.49659864
295 41.35593220
296 41.21621622
297 41.07744108
298 41.27516779
299 41.13712375
300 41.33333333
301 41.19601329
302 41.05960265
303 40.92409241
304 40.78947368
305 40.98360656
306 40.84967320
307 41.04234528
308 40.90909091
309 40.77669903
310 40.64516129
311 40.51446945
312 40.70512821
313 40.57507987
314 40.44585987
315 40.31746032
316 40.18987342
317 40.37854890
318 40.56603774
319 40.75235110
320 40.62500000
321 40.49844237
322 40.37267081
323 40.24767802
324 40.12345679
325 40.30769231
326 40.18404908
327 40.36697248
328 40.24390244
329 40.12158055
330 40.00000000
331 39.87915408
332 40.06024096
333 40.24024024
334 40.11976048
335 40.29850746
336 40.47619048
337 40.35608309
338 40.23668639
339 40.11799410
340 40.29411765
341 40.46920821
342 40.64327485
343 40.52478134
344 40.40697674
345 40.28985507
346 40.46242775
347 40.63400576
348 40.51724138
349 40.68767908
350 40.85714286
351 40.74074074
352 40.62500000
353 40.79320113
354 40.67796610
355 40.56338028
356 40.73033708
357 40.89635854
358 40.78212291
359 40.66852368
360 40.83333333
361 40.72022161
362 40.88397790
363 41.04683196
364 40.93406593
365 40.82191781
366 40.98360656
367 40.87193460
368 40.76086957
369 40.65040650
370 40.81081081
371 40.70080863
372 40.59139785
373 40.48257373
374 40.37433155
375 40.53333333
376 40.42553191
377 40.31830239
378 40.47619048
379 40.36939314
380 40.26315789
381 40.41994751
382 40.57591623
383 40.73107050
384 40.62500000
385 40.77922078
386 40.67357513
387 40.56847545
388 40.46391753
389 40.61696658
390 40.76923077
391 40.92071611
392 41.07142857
393 40.96692112
394 40.86294416
395 41.01265823
396 41.16161616
397 41.05793451
398 40.95477387
399 41.10275689
400 41.00000000
401 41.14713217
402 41.04477612
403 40.94292804
404 40.84158416
405 40.74074074
406 40.64039409
407 40.54054054
408 40.44117647
409 40.58679707
410 40.48780488
411 40.38929440
412 40.29126214
413 40.19370460
414 40.09661836
415 40.00000000
416 40.14423077
417 40.04796163
418 39.95215311
419 40.09546539
420 40.00000000
421 40.14251781
422 40.04739336
423 39.95271868
424 40.09433962
425 40.00000000
426 39.90610329
427 40.04683841
428 39.95327103
429 40.09324009
430 40.00000000
431 40.13921114
432 40.04629630
433 39.95381062
434 39.86175115
435 39.77011494
436 39.67889908
437 39.58810069
438 39.72602740
439 39.86332574
440 39.77272727
441 39.90929705
442 39.81900452
443 39.72911964
444 39.63963964
445 39.55056180
446 39.46188341
447 39.37360179
448 39.28571429
449 39.19821826
450 39.33333333
451 39.46784922
452 39.38053097
453 39.51434879
454 39.42731278
455 39.34065934
456 39.47368421
457 39.38730853
458 39.30131004
459 39.43355120
460 39.56521739
461 39.47939262
462 39.61038961
463 39.52483801
464 39.65517241
465 39.56989247
466 39.48497854
467 39.61456103
468 39.74358974
469 39.87206823
470 39.78723404
471 39.70276008
472 39.61864407
473 39.74630021
474 39.66244726
475 39.57894737
476 39.70588235
477 39.62264151
478 39.74895397
479 39.87473904
480 40.00000000
481 39.91683992
482 39.83402490
483 39.75155280
484 39.66942149
485 39.58762887
486 39.71193416
487 39.83572895
488 39.75409836
489 39.67280164
490 39.79591837
491 39.71486762
492 39.63414634
493 39.55375254
494 39.67611336
495 39.79797980
496 39.91935484
497 39.83903421
498 39.75903614
499 39.67935872
500 39.80000000
501 39.72055888
502 39.64143426
503 39.76143141
504 39.68253968
505 39.60396040
506 39.72332016
507 39.64497041
508 39.56692913
509 39.68565815
510 39.80392157
511 39.72602740
512 39.64843750
513 39.57115010
514 39.49416342
515 39.61165049
516 39.72868217
517 39.84526112
518 39.96138996
519 39.88439306
520 39.80769231
521 39.73128599
522 39.84674330
523 39.77055449
524 39.88549618
525 39.80952381
526 39.73384030
527 39.84819734
528 39.96212121
529 39.88657845
530 40.00000000
531 39.92467043
532 39.84962406
533 39.77485929
534 39.70037453
535 39.81308411
536 39.73880597
537 39.85102421
538 39.96282528
539 39.88868275
540 39.81481481
541 39.74121996
542 39.85239852
543 39.77900552
544 39.88970588
545 39.81651376
546 39.74358974
547 39.85374771
548 39.96350365
549 39.89071038
550 39.81818182
551 39.92740472
552 39.85507246
553 39.78300181
554 39.71119134
555 39.63963964
556 39.74820144
557 39.67684022
558 39.78494624
559 39.89266547
560 40.00000000
561 39.92869875
562 40.03558719
563 40.14209591
564 40.07092199
565 40.17699115
566 40.10600707
567 40.03527337
568 40.14084507
569 40.07029877
570 40.00000000
571 39.92994746
572 39.86013986
573 39.79057592
574 39.72125436
575 39.65217391
576 39.75694444
577 39.86135182
578 39.96539792
579 39.89637306
580 39.82758621
581 39.75903614
582 39.69072165
583 39.79416810
584 39.89726027
585 40.00000000
586 39.93174061
587 39.86371380
588 39.79591837
589 39.72835314
590 39.66101695
591 39.59390863
592 39.52702703
593 39.62900506
594 39.73063973
595 39.66386555
596 39.76510067
597 39.69849246
598 39.63210702
599 39.73288815
600 39.66666667
601 39.76705491
602 39.70099668
603 39.80099502
604 39.73509934
605 39.66942149
606 39.60396040
607 39.70345964
608 39.63815789
609 39.57307061
610 39.50819672
611 39.60720131
612 39.54248366
613 39.64110930
614 39.57654723
615 39.51219512
616 39.61038961
617 39.54619125
618 39.48220065
619 39.41841680
620 39.35483871
621 39.29146538
622 39.38906752
623 39.32584270
624 39.26282051
625 39.20000000
626 39.29712460
627 39.23444976
628 39.17197452
629 39.26868045
630 39.20634921
631 39.30269414
632 39.39873418
633 39.33649289
634 39.27444795
635 39.21259843
636 39.15094340
637 39.24646782
638 39.18495298
639 39.12363067
640 39.21875000
641 39.15756630
642 39.09657321
643 39.03576983
644 39.13043478
645 39.06976744
646 39.00928793
647 38.94899536
648 38.88888889
649 38.82896764
650 38.76923077
651 38.70967742
652 38.65030675
653 38.59111792
654 38.68501529
655 38.62595420
656 38.71951220
657 38.81278539
658 38.90577508
659 38.84673748
660 38.78787879
661 38.72919818
662 38.67069486
663 38.61236802
664 38.70481928
665 38.79699248
666 38.73873874
667 38.83058471
668 38.92215569
669 39.01345291
670 38.95522388
671 39.04619970
672 38.98809524
673 38.93016345
674 38.87240356
675 38.81481481
676 38.75739645
677 38.70014771
678 38.79056047
679 38.73343152
680 38.82352941
681 38.91336270
682 38.85630499
683 38.79941435
684 38.88888889
685 38.97810219
686 38.92128280
687 38.86462882
688 38.80813953
689 38.89695210
690 38.84057971
691 38.78437048
692 38.72832370
693 38.67243867
694 38.61671470
695 38.56115108
696 38.64942529
697 38.59397418
698 38.68194842
699 38.76967096
700 38.71428571
701 38.65905849
702 38.74643875
703 38.83357041
704 38.92045455
705 38.86524823
706 38.81019830
707 38.75530410
708 38.70056497
709 38.64598025
710 38.59154930
711 38.67791842
712 38.62359551
713 38.70967742
714 38.79551821
715 38.74125874
716 38.68715084
717 38.63319386
718 38.57938719
719 38.52573018
720 38.47222222
721 38.41886269
722 38.36565097
723 38.45089903
724 38.53591160
725 38.62068966
726 38.56749311
727 38.65199450
728 38.59890110
729 38.54595336
730 38.49315068
731 38.44049248
732 38.38797814
733 38.47203274
734 38.41961853
735 38.36734694
736 38.31521739
737 38.39891452
738 38.48238482
739 38.56562923
740 38.51351351
741 38.46153846
742 38.54447439
743 38.62718708
744 38.70967742
745 38.65771812
746 38.60589812
747 38.55421687
748 38.50267380
749 38.45126836
750 38.53333333
Final result: 38.5333 +/- 1.7783
Random chance: 25.0000 +/- 1.5822
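
The acc_norm column above reads as a running accuracy: the value at task k is the percentage of the first k tasks answered correctly. The two closing lines then attach an error bar to the final score and to the 25% baseline (which suggests four answer choices per task). The sketch below is one way to reproduce those numbers in Python. It assumes the error bar is a binomial standard error with an n - 1 denominator; that formula is inferred from the printed values, not taken from the tool's source, and the helper names and the count of 289 correct answers (38.5333% of 750) are likewise illustrative.

```python
import math


def running_accuracy(outcomes):
    """Cumulative accuracy in percent after each task (1 = correct, 0 = wrong)."""
    correct = 0
    result = []
    for k, hit in enumerate(outcomes, start=1):
        correct += hit
        result.append(100.0 * correct / k)
    return result


def binomial_stderr(p, n):
    """Standard error of a proportion p over n tasks.

    The n - 1 denominator is an assumption chosen because it matches the log.
    """
    return math.sqrt(p * (1.0 - p) / (n - 1))


n_tasks = 750
n_correct = 289  # inferred: 289 / 750 = 38.5333%
p = n_correct / n_tasks

# First five rows of the table correspond to: correct, wrong, wrong, correct, wrong
print([round(a, 8) for a in running_accuracy([1, 0, 0, 1, 0])])

print(f"Final result: {100 * p:.4f} +/- {100 * binomial_stderr(p, n_tasks):.4f}")
print(f"Random chance: {100 * 0.25:.4f} +/- {100 * binomial_stderr(0.25, n_tasks):.4f}")
```

Under these assumptions the script yields the same 1.7783 and 1.5822 error bars as the log. The gap between 38.53% and 25% is several standard errors wide, so the score sits clearly above chance for this 750-task sample.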