File size: 12,593 Bytes
94d6d71
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 817 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 817 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	25.00000000
5	20.00000000
6	16.66666667
7	14.28571429
8	12.50000000
9	11.11111111
10	10.00000000
11	9.09090909
12	16.66666667
13	15.38461538
14	14.28571429
15	13.33333333
16	18.75000000
17	17.64705882
18	16.66666667
19	21.05263158
20	20.00000000
21	19.04761905
22	18.18181818
23	17.39130435
24	20.83333333
25	20.00000000
26	23.07692308
27	22.22222222
28	21.42857143
29	24.13793103
30	23.33333333
31	25.80645161
32	25.00000000
33	27.27272727
34	26.47058824
35	25.71428571
36	25.00000000
37	27.02702703
38	26.31578947
39	25.64102564
40	25.00000000
41	24.39024390
42	23.80952381
43	25.58139535
44	25.00000000
45	24.44444444
46	26.08695652
47	25.53191489
48	27.08333333
49	26.53061224
50	28.00000000
51	27.45098039
52	26.92307692
53	28.30188679
54	29.62962963
55	30.90909091
56	30.35714286
57	29.82456140
58	29.31034483
59	28.81355932
60	30.00000000
61	29.50819672
62	30.64516129
63	30.15873016
64	29.68750000
65	29.23076923
66	28.78787879
67	28.35820896
68	27.94117647
69	27.53623188
70	28.57142857
71	29.57746479
72	30.55555556
73	31.50684932
74	31.08108108
75	30.66666667
76	31.57894737
77	31.16883117
78	30.76923077
79	30.37974684
80	30.00000000
81	29.62962963
82	29.26829268
83	28.91566265
84	28.57142857
85	28.23529412
86	27.90697674
87	27.58620690
88	27.27272727
89	28.08988764
90	27.77777778
91	27.47252747
92	27.17391304
93	26.88172043
94	26.59574468
95	27.36842105
96	28.12500000
97	27.83505155
98	27.55102041
99	28.28282828
100	29.00000000
101	29.70297030
102	29.41176471
103	30.09708738
104	29.80769231
105	29.52380952
106	30.18867925
107	30.84112150
108	31.48148148
109	31.19266055
110	31.81818182
111	31.53153153
112	31.25000000
113	30.97345133
114	31.57894737
115	31.30434783
116	31.03448276
117	30.76923077
118	30.50847458
119	30.25210084
120	30.00000000
121	30.57851240
122	30.32786885
123	30.89430894
124	31.45161290
125	32.00000000
126	32.53968254
127	32.28346457
128	32.03125000
129	31.78294574
130	31.53846154
131	31.29770992
132	31.06060606
133	30.82706767
134	31.34328358
135	31.85185185
136	31.61764706
137	32.11678832
138	32.60869565
139	33.09352518
140	33.57142857
141	33.33333333
142	33.09859155
143	32.86713287
144	33.33333333
145	33.10344828
146	32.87671233
147	33.33333333
148	33.10810811
149	32.88590604
150	32.66666667
151	33.11258278
152	32.89473684
153	32.67973856
154	32.46753247
155	32.25806452
156	32.69230769
157	32.48407643
158	32.91139241
159	32.70440252
160	32.50000000
161	32.29813665
162	32.09876543
163	32.51533742
164	32.31707317
165	32.72727273
166	33.13253012
167	32.93413174
168	32.73809524
169	32.54437870
170	32.35294118
171	32.16374269
172	31.97674419
173	32.36994220
174	32.18390805
175	32.00000000
176	32.38636364
177	32.20338983
178	32.02247191
179	31.84357542
180	31.66666667
181	31.49171271
182	31.31868132
183	31.14754098
184	30.97826087
185	30.81081081
186	31.18279570
187	31.01604278
188	30.85106383
189	31.21693122
190	31.05263158
191	30.89005236
192	30.72916667
193	30.56994819
194	30.92783505
195	30.76923077
196	30.61224490
197	30.96446701
198	31.31313131
199	31.15577889
200	31.50000000
201	31.34328358
202	31.68316832
203	31.52709360
204	31.37254902
205	31.70731707
206	31.55339806
207	31.88405797
208	31.73076923
209	32.05741627
210	31.90476190
211	31.75355450
212	31.60377358
213	31.45539906
214	31.30841121
215	31.16279070
216	31.01851852
217	30.87557604
218	31.19266055
219	31.05022831
220	30.90909091
221	30.76923077
222	30.63063063
223	30.49327354
224	30.80357143
225	30.66666667
226	30.53097345
227	30.39647577
228	30.26315789
229	30.13100437
230	30.00000000
231	29.87012987
232	29.74137931
233	30.04291845
234	29.91452991
235	29.78723404
236	30.08474576
237	30.37974684
238	30.67226891
239	30.54393305
240	30.41666667
241	30.29045643
242	30.16528926
243	30.04115226
244	29.91803279
245	29.79591837
246	29.67479675
247	29.55465587
248	29.83870968
249	30.12048193
250	30.00000000
251	30.27888446
252	30.15873016
253	30.03952569
254	29.92125984
255	29.80392157
256	29.68750000
257	29.57198444
258	29.45736434
259	29.34362934
260	29.23076923
261	29.50191571
262	29.38931298
263	29.27756654
264	29.54545455
265	29.81132075
266	29.69924812
267	29.96254682
268	29.85074627
269	29.73977695
270	29.62962963
271	29.52029520
272	29.41176471
273	29.30402930
274	29.19708029
275	29.09090909
276	28.98550725
277	28.88086643
278	28.77697842
279	28.67383513
280	28.57142857
281	28.82562278
282	28.72340426
283	28.62190813
284	28.52112676
285	28.42105263
286	28.67132867
287	28.57142857
288	28.47222222
289	28.37370242
290	28.27586207
291	28.52233677
292	28.76712329
293	29.01023891
294	28.91156463
295	29.15254237
296	29.39189189
297	29.62962963
298	29.53020134
299	29.43143813
300	29.33333333
301	29.23588040
302	29.13907285
303	29.04290429
304	28.94736842
305	28.85245902
306	28.75816993
307	28.66449511
308	28.89610390
309	28.80258900
310	28.70967742
311	28.61736334
312	28.84615385
313	28.75399361
314	28.66242038
315	28.57142857
316	28.48101266
317	28.39116719
318	28.30188679
319	28.21316614
320	28.12500000
321	28.03738318
322	28.26086957
323	28.17337461
324	28.08641975
325	28.00000000
326	27.91411043
327	27.82874618
328	28.04878049
329	27.96352584
330	28.18181818
331	28.09667674
332	28.31325301
333	28.52852853
334	28.44311377
335	28.65671642
336	28.86904762
337	28.78338279
338	28.69822485
339	28.90855457
340	28.82352941
341	29.03225806
342	29.23976608
343	29.15451895
344	29.36046512
345	29.56521739
346	29.76878613
347	29.97118156
348	29.88505747
349	30.08595989
350	30.00000000
351	29.91452991
352	30.11363636
353	30.31161473
354	30.22598870
355	30.42253521
356	30.33707865
357	30.25210084
358	30.44692737
359	30.36211699
360	30.27777778
361	30.19390582
362	30.11049724
363	30.30303030
364	30.21978022
365	30.41095890
366	30.32786885
367	30.51771117
368	30.43478261
369	30.35230352
370	30.27027027
371	30.18867925
372	30.37634409
373	30.29490617
374	30.48128342
375	30.40000000
376	30.31914894
377	30.23872679
378	30.42328042
379	30.34300792
380	30.26315789
381	30.18372703
382	30.10471204
383	30.02610966
384	29.94791667
385	29.87012987
386	29.79274611
387	29.71576227
388	29.89690722
389	29.82005141
390	29.74358974
391	29.66751918
392	29.59183673
393	29.51653944
394	29.69543147
395	29.62025316
396	29.54545455
397	29.72292191
398	29.64824121
399	29.57393484
400	29.50000000
401	29.42643392
402	29.60199005
403	29.52853598
404	29.45544554
405	29.38271605
406	29.31034483
407	29.23832924
408	29.16666667
409	29.09535452
410	29.02439024
411	28.95377129
412	29.12621359
413	29.05569007
414	29.22705314
415	29.15662651
416	29.08653846
417	29.01678657
418	28.94736842
419	28.87828162
420	29.04761905
421	29.21615202
422	29.14691943
423	29.07801418
424	29.00943396
425	28.94117647
426	28.87323944
427	28.80562061
428	28.73831776
429	28.67132867
430	28.60465116
431	28.53828306
432	28.70370370
433	28.86836028
434	29.03225806
435	28.96551724
436	29.12844037
437	29.06178490
438	28.99543379
439	28.92938497
440	29.09090909
441	29.02494331
442	29.18552036
443	29.11963883
444	29.27927928
445	29.21348315
446	29.14798206
447	29.08277405
448	29.01785714
449	29.17594655
450	29.11111111
451	29.04656319
452	28.98230088
453	28.91832230
454	28.85462555
455	28.79120879
456	28.72807018
457	28.66520788
458	28.60262009
459	28.54030501
460	28.47826087
461	28.41648590
462	28.35497835
463	28.29373650
464	28.23275862
465	28.38709677
466	28.32618026
467	28.26552463
468	28.20512821
469	28.14498934
470	28.08510638
471	28.02547771
472	27.96610169
473	27.90697674
474	27.84810127
475	28.00000000
476	27.94117647
477	28.09224319
478	28.03347280
479	27.97494781
480	27.91666667
481	27.85862786
482	27.80082988
483	27.95031056
484	28.09917355
485	28.24742268
486	28.18930041
487	28.13141684
488	28.07377049
489	28.22085890
490	28.36734694
491	28.30957230
492	28.45528455
493	28.39756592
494	28.34008097
495	28.48484848
496	28.42741935
497	28.37022133
498	28.31325301
499	28.25651303
500	28.20000000
501	28.34331337
502	28.48605578
503	28.42942346
504	28.57142857
505	28.51485149
506	28.45849802
507	28.40236686
508	28.34645669
509	28.48722986
510	28.62745098
511	28.57142857
512	28.51562500
513	28.46003899
514	28.40466926
515	28.54368932
516	28.48837209
517	28.62669246
518	28.57142857
519	28.51637765
520	28.46153846
521	28.59884837
522	28.54406130
523	28.68068834
524	28.62595420
525	28.76190476
526	28.70722433
527	28.65275142
528	28.78787879
529	28.73345936
530	28.67924528
531	28.81355932
532	28.75939850
533	28.89305816
534	29.02621723
535	28.97196262
536	28.91791045
537	29.05027933
538	28.99628253
539	28.94248609
540	28.88888889
541	28.83548983
542	28.78228782
543	28.72928177
544	28.67647059
545	28.62385321
546	28.75457875
547	28.70201097
548	28.83211679
549	28.77959927
550	28.72727273
551	28.67513612
552	28.80434783
553	28.75226040
554	28.70036101
555	28.64864865
556	28.77697842
557	28.72531418
558	28.67383513
559	28.62254025
560	28.75000000
561	28.69875223
562	28.64768683
563	28.59680284
564	28.72340426
565	28.84955752
566	28.79858657
567	28.74779541
568	28.87323944
569	28.99824253
570	28.94736842
571	28.89667250
572	28.84615385
573	28.79581152
574	28.91986063
575	29.04347826
576	28.99305556
577	28.94280763
578	29.06574394
579	29.01554404
580	29.13793103
581	29.25989673
582	29.20962199
583	29.15951973
584	29.28082192
585	29.23076923
586	29.35153584
587	29.30153322
588	29.25170068
589	29.20203735
590	29.32203390
591	29.27241963
592	29.22297297
593	29.17369309
594	29.12457912
595	29.07563025
596	29.02684564
597	29.14572864
598	29.09698997
599	29.04841402
600	29.00000000
601	29.11813644
602	29.23588040
603	29.18739635
604	29.13907285
605	29.09090909
606	29.20792079
607	29.15980231
608	29.11184211
609	29.06403941
610	29.01639344
611	28.96890344
612	29.08496732
613	29.03752039
614	28.99022801
615	28.94308943
616	29.05844156
617	29.17341977
618	29.12621359
619	29.24071082
620	29.19354839
621	29.14653784
622	29.26045016
623	29.21348315
624	29.16666667
625	29.12000000
626	29.07348243
627	29.18660287
628	29.14012739
629	29.09379968
630	29.04761905
631	29.16006339
632	29.11392405
633	29.06793049
634	29.17981073
635	29.13385827
636	29.24528302
637	29.35635793
638	29.31034483
639	29.26447574
640	29.21875000
641	29.17316693
642	29.28348910
643	29.23794712
644	29.34782609
645	29.45736434
646	29.41176471
647	29.36630603
648	29.47530864
649	29.42989214
650	29.53846154
651	29.64669739
652	29.75460123
653	29.86217458
654	29.81651376
655	29.77099237
656	29.72560976
657	29.83257230
658	29.78723404
659	29.74203338
660	29.69696970
661	29.65204236
662	29.60725076
663	29.56259427
664	29.51807229
665	29.62406015
666	29.72972973
667	29.68515742
668	29.64071856
669	29.59641256
670	29.55223881
671	29.50819672
672	29.46428571
673	29.42050520
674	29.37685460
675	29.48148148
676	29.58579882
677	29.54209749
678	29.49852507
679	29.45508100
680	29.55882353
681	29.51541850
682	29.47214076
683	29.42898975
684	29.53216374
685	29.48905109
686	29.44606414
687	29.40320233
688	29.50581395
689	29.60812772
690	29.56521739
691	29.66714906
692	29.62427746
693	29.72582973
694	29.68299712
695	29.78417266
696	29.88505747
697	29.84218077
698	29.79942693
699	29.89985694
700	30.00000000
701	29.95720399
702	29.91452991
703	29.87197724
704	29.82954545
705	29.78723404
706	29.74504249
707	29.70297030
708	29.66101695
709	29.61918195
710	29.57746479
711	29.67651195
712	29.77528090
713	29.73352034
714	29.69187675
715	29.65034965
716	29.60893855
717	29.56764296
718	29.52646240
719	29.62447844
720	29.58333333
721	29.54230236
722	29.50138504
723	29.46058091
724	29.41988950
725	29.37931034
726	29.47658402
727	29.43603851
728	29.39560440
729	29.35528121
730	29.45205479
731	29.41176471
732	29.50819672
733	29.46793997
734	29.42779292
735	29.38775510
736	29.34782609
737	29.44369064
738	29.40379404
739	29.49932341
740	29.45945946
741	29.55465587
742	29.51482480
743	29.47510094
744	29.56989247
745	29.53020134
746	29.62466488
747	29.58500669
748	29.54545455
749	29.50600801
750	29.46666667

Final result: 29.4667 ±1.6658
Random chance: 19.8992 ±1.4588