File size: 12,593 Bytes
94d6d71
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 817 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 817 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	25.00000000
5	20.00000000
6	16.66666667
7	14.28571429
8	12.50000000
9	11.11111111
10	10.00000000
11	9.09090909
12	16.66666667
13	15.38461538
14	14.28571429
15	13.33333333
16	18.75000000
17	17.64705882
18	16.66666667
19	21.05263158
20	20.00000000
21	19.04761905
22	18.18181818
23	17.39130435
24	20.83333333
25	20.00000000
26	23.07692308
27	22.22222222
28	21.42857143
29	24.13793103
30	23.33333333
31	25.80645161
32	25.00000000
33	27.27272727
34	26.47058824
35	25.71428571
36	25.00000000
37	27.02702703
38	26.31578947
39	25.64102564
40	25.00000000
41	24.39024390
42	23.80952381
43	25.58139535
44	27.27272727
45	26.66666667
46	28.26086957
47	27.65957447
48	29.16666667
49	28.57142857
50	28.00000000
51	27.45098039
52	26.92307692
53	28.30188679
54	29.62962963
55	30.90909091
56	30.35714286
57	29.82456140
58	29.31034483
59	28.81355932
60	30.00000000
61	29.50819672
62	30.64516129
63	30.15873016
64	29.68750000
65	29.23076923
66	28.78787879
67	28.35820896
68	27.94117647
69	27.53623188
70	28.57142857
71	29.57746479
72	30.55555556
73	31.50684932
74	31.08108108
75	30.66666667
76	31.57894737
77	31.16883117
78	30.76923077
79	30.37974684
80	30.00000000
81	29.62962963
82	30.48780488
83	30.12048193
84	29.76190476
85	29.41176471
86	29.06976744
87	28.73563218
88	28.40909091
89	28.08988764
90	27.77777778
91	28.57142857
92	28.26086957
93	27.95698925
94	27.65957447
95	28.42105263
96	29.16666667
97	28.86597938
98	28.57142857
99	29.29292929
100	30.00000000
101	30.69306931
102	30.39215686
103	31.06796117
104	30.76923077
105	30.47619048
106	31.13207547
107	31.77570093
108	32.40740741
109	32.11009174
110	32.72727273
111	32.43243243
112	32.14285714
113	31.85840708
114	32.45614035
115	32.17391304
116	31.89655172
117	31.62393162
118	31.35593220
119	31.09243697
120	31.66666667
121	32.23140496
122	31.96721311
123	32.52032520
124	33.06451613
125	33.60000000
126	34.12698413
127	33.85826772
128	33.59375000
129	33.33333333
130	33.07692308
131	32.82442748
132	32.57575758
133	32.33082707
134	32.83582090
135	32.59259259
136	32.35294118
137	32.84671533
138	33.33333333
139	33.09352518
140	33.57142857
141	33.33333333
142	33.09859155
143	32.86713287
144	33.33333333
145	33.10344828
146	32.87671233
147	33.33333333
148	33.10810811
149	32.88590604
150	32.66666667
151	33.11258278
152	32.89473684
153	32.67973856
154	32.46753247
155	32.90322581
156	33.33333333
157	33.12101911
158	33.54430380
159	33.33333333
160	33.12500000
161	32.91925466
162	32.71604938
163	33.12883436
164	32.92682927
165	33.33333333
166	33.73493976
167	33.53293413
168	33.33333333
169	33.13609467
170	32.94117647
171	32.74853801
172	32.55813953
173	32.36994220
174	32.18390805
175	32.00000000
176	32.38636364
177	32.20338983
178	32.02247191
179	31.84357542
180	31.66666667
181	31.49171271
182	31.31868132
183	31.14754098
184	30.97826087
185	30.81081081
186	31.18279570
187	31.01604278
188	30.85106383
189	30.68783069
190	30.52631579
191	30.36649215
192	30.20833333
193	30.05181347
194	30.41237113
195	30.25641026
196	30.10204082
197	30.45685279
198	30.80808081
199	30.65326633
200	31.00000000
201	30.84577114
202	31.18811881
203	31.03448276
204	30.88235294
205	30.73170732
206	30.58252427
207	30.91787440
208	30.76923077
209	30.62200957
210	30.47619048
211	30.33175355
212	30.18867925
213	30.04694836
214	29.90654206
215	29.76744186
216	29.62962963
217	29.49308756
218	29.81651376
219	29.68036530
220	29.54545455
221	29.41176471
222	29.27927928
223	29.59641256
224	29.91071429
225	29.77777778
226	29.64601770
227	29.51541850
228	29.38596491
229	29.25764192
230	29.13043478
231	29.00432900
232	29.31034483
233	29.61373391
234	29.91452991
235	29.78723404
236	30.08474576
237	30.37974684
238	30.67226891
239	30.54393305
240	30.41666667
241	30.29045643
242	30.16528926
243	30.04115226
244	29.91803279
245	29.79591837
246	29.67479675
247	29.55465587
248	29.83870968
249	29.71887550
250	29.60000000
251	29.88047809
252	29.76190476
253	29.64426877
254	29.52755906
255	29.41176471
256	29.29687500
257	29.18287938
258	29.06976744
259	28.95752896
260	28.84615385
261	28.73563218
262	28.62595420
263	28.51711027
264	28.78787879
265	29.05660377
266	28.94736842
267	29.21348315
268	29.10447761
269	28.99628253
270	28.88888889
271	28.78228782
272	28.67647059
273	28.57142857
274	28.46715328
275	28.36363636
276	28.26086957
277	28.15884477
278	28.05755396
279	27.95698925
280	27.85714286
281	28.11387900
282	28.01418440
283	27.91519435
284	27.81690141
285	27.71929825
286	27.97202797
287	27.87456446
288	27.77777778
289	27.68166090
290	27.58620690
291	27.83505155
292	28.08219178
293	28.32764505
294	28.23129252
295	28.47457627
296	28.71621622
297	28.95622896
298	28.85906040
299	28.76254181
300	28.66666667
301	28.57142857
302	28.80794702
303	28.71287129
304	28.61842105
305	28.52459016
306	28.43137255
307	28.33876221
308	28.57142857
309	28.47896440
310	28.38709677
311	28.29581994
312	28.52564103
313	28.43450479
314	28.34394904
315	28.25396825
316	28.16455696
317	28.07570978
318	27.98742138
319	27.89968652
320	27.81250000
321	27.72585670
322	27.95031056
323	27.86377709
324	27.77777778
325	27.69230769
326	27.60736196
327	27.52293578
328	27.74390244
329	27.65957447
330	27.87878788
331	27.79456193
332	28.01204819
333	28.22822823
334	28.14371257
335	28.35820896
336	28.57142857
337	28.48664688
338	28.40236686
339	28.61356932
340	28.52941176
341	28.73900293
342	28.94736842
343	28.86297376
344	29.06976744
345	29.27536232
346	29.47976879
347	29.68299712
348	29.59770115
349	29.51289398
350	29.42857143
351	29.34472934
352	29.54545455
353	29.74504249
354	29.66101695
355	29.85915493
356	29.77528090
357	29.69187675
358	29.88826816
359	29.80501393
360	29.72222222
361	29.63988920
362	29.55801105
363	29.47658402
364	29.39560440
365	29.58904110
366	29.50819672
367	29.70027248
368	29.61956522
369	29.53929539
370	29.45945946
371	29.38005391
372	29.56989247
373	29.49061662
374	29.67914439
375	29.60000000
376	29.52127660
377	29.44297082
378	29.36507937
379	29.28759894
380	29.21052632
381	29.13385827
382	29.05759162
383	28.98172324
384	28.90625000
385	28.83116883
386	28.75647668
387	28.68217054
388	28.86597938
389	28.79177378
390	28.71794872
391	28.90025575
392	28.82653061
393	29.00763359
394	29.18781726
395	29.11392405
396	29.04040404
397	29.21914358
398	29.14572864
399	29.07268170
400	29.00000000
401	28.92768080
402	29.10447761
403	29.03225806
404	28.96039604
405	28.88888889
406	28.81773399
407	28.74692875
408	28.67647059
409	28.60635697
410	28.53658537
411	28.46715328
412	28.64077670
413	28.57142857
414	28.74396135
415	28.67469880
416	28.60576923
417	28.53717026
418	28.46889952
419	28.40095465
420	28.57142857
421	28.74109264
422	28.67298578
423	28.60520095
424	28.53773585
425	28.47058824
426	28.40375587
427	28.33723653
428	28.27102804
429	28.20512821
430	28.13953488
431	28.07424594
432	28.24074074
433	28.40646651
434	28.57142857
435	28.50574713
436	28.44036697
437	28.37528604
438	28.31050228
439	28.24601367
440	28.18181818
441	28.11791383
442	28.05429864
443	27.99097065
444	28.15315315
445	28.08988764
446	28.02690583
447	27.96420582
448	27.90178571
449	28.06236080
450	28.00000000
451	27.93791574
452	27.87610619
453	27.81456954
454	27.75330396
455	27.69230769
456	27.63157895
457	27.57111597
458	27.51091703
459	27.45098039
460	27.39130435
461	27.33188720
462	27.27272727
463	27.21382289
464	27.15517241
465	27.31182796
466	27.25321888
467	27.40899358
468	27.35042735
469	27.29211087
470	27.23404255
471	27.17622081
472	27.11864407
473	27.27272727
474	27.42616034
475	27.57894737
476	27.52100840
477	27.67295597
478	27.61506276
479	27.55741127
480	27.50000000
481	27.44282744
482	27.38589212
483	27.32919255
484	27.47933884
485	27.62886598
486	27.57201646
487	27.72073922
488	27.66393443
489	27.81186094
490	27.95918367
491	27.90224033
492	28.04878049
493	27.99188641
494	27.93522267
495	28.08080808
496	28.02419355
497	27.96780684
498	27.91164659
499	27.85571142
500	27.80000000
501	27.94411178
502	28.08764940
503	28.03180915
504	28.17460317
505	28.31683168
506	28.26086957
507	28.20512821
508	28.14960630
509	28.29076621
510	28.43137255
511	28.37573386
512	28.32031250
513	28.26510721
514	28.21011673
515	28.34951456
516	28.29457364
517	28.43326886
518	28.37837838
519	28.32369942
520	28.26923077
521	28.40690979
522	28.35249042
523	28.48948375
524	28.43511450
525	28.38095238
526	28.32699620
527	28.46299810
528	28.40909091
529	28.35538752
530	28.30188679
531	28.43691149
532	28.38345865
533	28.51782364
534	28.46441948
535	28.41121495
536	28.35820896
537	28.49162011
538	28.43866171
539	28.38589981
540	28.51851852
541	28.46580407
542	28.41328413
543	28.36095764
544	28.30882353
545	28.25688073
546	28.38827839
547	28.51919561
548	28.64963504
549	28.59744991
550	28.54545455
551	28.49364791
552	28.62318841
553	28.75226040
554	28.70036101
555	28.64864865
556	28.77697842
557	28.72531418
558	28.67383513
559	28.62254025
560	28.57142857
561	28.52049911
562	28.46975089
563	28.41918295
564	28.54609929
565	28.67256637
566	28.62190813
567	28.57142857
568	28.52112676
569	28.47100176
570	28.42105263
571	28.37127846
572	28.32167832
573	28.27225131
574	28.39721254
575	28.52173913
576	28.47222222
577	28.42287695
578	28.54671280
579	28.67012090
580	28.79310345
581	28.91566265
582	29.03780069
583	28.98799314
584	29.10958904
585	29.05982906
586	29.18088737
587	29.30153322
588	29.25170068
589	29.20203735
590	29.32203390
591	29.27241963
592	29.22297297
593	29.17369309
594	29.29292929
595	29.24369748
596	29.19463087
597	29.31323283
598	29.26421405
599	29.21535893
600	29.16666667
601	29.28452579
602	29.40199336
603	29.35323383
604	29.30463576
605	29.25619835
606	29.37293729
607	29.32454695
608	29.27631579
609	29.22824302
610	29.18032787
611	29.13256956
612	29.24836601
613	29.20065253
614	29.15309446
615	29.10569106
616	29.22077922
617	29.33549433
618	29.28802589
619	29.40226171
620	29.35483871
621	29.30756844
622	29.42122186
623	29.37399679
624	29.32692308
625	29.28000000
626	29.23322684
627	29.34609250
628	29.29936306
629	29.25278219
630	29.20634921
631	29.31854200
632	29.27215190
633	29.22590837
634	29.33753943
635	29.29133858
636	29.24528302
637	29.35635793
638	29.31034483
639	29.26447574
640	29.21875000
641	29.17316693
642	29.28348910
643	29.23794712
644	29.34782609
645	29.45736434
646	29.41176471
647	29.36630603
648	29.47530864
649	29.42989214
650	29.53846154
651	29.64669739
652	29.75460123
653	29.86217458
654	29.81651376
655	29.77099237
656	29.72560976
657	29.83257230
658	29.78723404
659	29.74203338
660	29.69696970
661	29.65204236
662	29.60725076
663	29.56259427
664	29.51807229
665	29.62406015
666	29.72972973
667	29.68515742
668	29.64071856
669	29.74588939
670	29.70149254
671	29.65722802
672	29.61309524
673	29.56909361
674	29.52522255
675	29.62962963
676	29.73372781
677	29.68980798
678	29.64601770
679	29.60235641
680	29.70588235
681	29.66226138
682	29.61876833
683	29.57540264
684	29.67836257
685	29.63503650
686	29.59183673
687	29.54876274
688	29.65116279
689	29.60812772
690	29.56521739
691	29.52243126
692	29.47976879
693	29.58152958
694	29.68299712
695	29.64028777
696	29.74137931
697	29.69870875
698	29.65616046
699	29.75679542
700	29.85714286
701	29.81455064
702	29.77207977
703	29.72972973
704	29.82954545
705	29.78723404
706	29.74504249
707	29.70297030
708	29.66101695
709	29.61918195
710	29.57746479
711	29.53586498
712	29.63483146
713	29.59326788
714	29.55182073
715	29.51048951
716	29.46927374
717	29.42817294
718	29.38718663
719	29.48539638
720	29.44444444
721	29.40360610
722	29.36288089
723	29.32226833
724	29.28176796
725	29.24137931
726	29.33884298
727	29.29848693
728	29.25824176
729	29.21810700
730	29.31506849
731	29.27496580
732	29.37158470
733	29.33151432
734	29.29155313
735	29.25170068
736	29.21195652
737	29.17232022
738	29.13279133
739	29.09336942
740	29.05405405
741	29.01484480
742	28.97574124
743	28.93674293
744	29.03225806
745	28.99328859
746	29.08847185
747	29.04953146
748	29.01069519
749	29.10547397
750	29.06666667

Final result: 29.0667 ±1.6591
Random chance: 19.8992 ±1.4588