File size: 12,593 Bytes
94d6d71
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 817 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 817 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	25.00000000
5	20.00000000
6	16.66666667
7	14.28571429
8	12.50000000
9	11.11111111
10	10.00000000
11	9.09090909
12	16.66666667
13	15.38461538
14	14.28571429
15	13.33333333
16	12.50000000
17	17.64705882
18	16.66666667
19	21.05263158
20	20.00000000
21	19.04761905
22	18.18181818
23	17.39130435
24	20.83333333
25	20.00000000
26	23.07692308
27	22.22222222
28	21.42857143
29	24.13793103
30	23.33333333
31	25.80645161
32	25.00000000
33	27.27272727
34	26.47058824
35	25.71428571
36	25.00000000
37	27.02702703
38	26.31578947
39	25.64102564
40	25.00000000
41	24.39024390
42	23.80952381
43	25.58139535
44	25.00000000
45	24.44444444
46	26.08695652
47	25.53191489
48	27.08333333
49	26.53061224
50	28.00000000
51	27.45098039
52	26.92307692
53	28.30188679
54	29.62962963
55	30.90909091
56	30.35714286
57	29.82456140
58	29.31034483
59	28.81355932
60	30.00000000
61	29.50819672
62	30.64516129
63	30.15873016
64	29.68750000
65	29.23076923
66	28.78787879
67	28.35820896
68	27.94117647
69	27.53623188
70	27.14285714
71	28.16901408
72	29.16666667
73	30.13698630
74	29.72972973
75	29.33333333
76	30.26315789
77	29.87012987
78	29.48717949
79	29.11392405
80	28.75000000
81	28.39506173
82	28.04878049
83	27.71084337
84	27.38095238
85	27.05882353
86	26.74418605
87	26.43678161
88	26.13636364
89	26.96629213
90	26.66666667
91	26.37362637
92	26.08695652
93	25.80645161
94	25.53191489
95	26.31578947
96	27.08333333
97	26.80412371
98	26.53061224
99	27.27272727
100	27.00000000
101	27.72277228
102	27.45098039
103	28.15533981
104	27.88461538
105	27.61904762
106	27.35849057
107	28.03738318
108	28.70370370
109	28.44036697
110	29.09090909
111	28.82882883
112	28.57142857
113	28.31858407
114	28.07017544
115	27.82608696
116	27.58620690
117	27.35042735
118	27.11864407
119	26.89075630
120	26.66666667
121	27.27272727
122	27.04918033
123	26.82926829
124	27.41935484
125	28.00000000
126	28.57142857
127	28.34645669
128	28.12500000
129	27.90697674
130	27.69230769
131	27.48091603
132	27.27272727
133	27.06766917
134	27.61194030
135	27.40740741
136	27.20588235
137	27.73722628
138	27.53623188
139	28.05755396
140	27.85714286
141	27.65957447
142	27.46478873
143	27.27272727
144	27.08333333
145	26.89655172
146	26.71232877
147	26.53061224
148	26.35135135
149	26.17449664
150	26.00000000
151	26.49006623
152	26.31578947
153	26.14379085
154	25.97402597
155	25.80645161
156	26.28205128
157	26.11464968
158	26.58227848
159	26.41509434
160	26.25000000
161	26.08695652
162	25.92592593
163	25.76687117
164	25.60975610
165	26.06060606
166	26.50602410
167	26.34730539
168	26.19047619
169	26.03550296
170	25.88235294
171	25.73099415
172	25.58139535
173	25.43352601
174	25.28735632
175	25.14285714
176	25.56818182
177	25.42372881
178	25.28089888
179	25.13966480
180	25.00000000
181	24.86187845
182	24.72527473
183	24.59016393
184	24.45652174
185	24.32432432
186	24.73118280
187	24.59893048
188	24.46808511
189	24.86772487
190	24.73684211
191	24.60732984
192	24.47916667
193	24.35233161
194	24.74226804
195	24.61538462
196	24.48979592
197	24.87309645
198	25.25252525
199	25.12562814
200	25.00000000
201	24.87562189
202	25.24752475
203	25.12315271
204	25.00000000
205	25.36585366
206	25.24271845
207	25.60386473
208	25.48076923
209	25.35885167
210	25.23809524
211	25.11848341
212	25.00000000
213	24.88262911
214	24.76635514
215	24.65116279
216	25.00000000
217	24.88479263
218	25.22935780
219	25.11415525
220	25.00000000
221	24.88687783
222	24.77477477
223	24.66367713
224	25.00000000
225	24.88888889
226	24.77876106
227	24.66960352
228	24.56140351
229	24.45414847
230	24.34782609
231	24.24242424
232	24.56896552
233	24.89270386
234	24.78632479
235	24.68085106
236	25.00000000
237	25.31645570
238	25.63025210
239	25.52301255
240	25.41666667
241	25.31120332
242	25.20661157
243	25.10288066
244	25.00000000
245	24.89795918
246	24.79674797
247	24.69635628
248	24.59677419
249	24.49799197
250	24.40000000
251	24.70119522
252	24.60317460
253	24.50592885
254	24.40944882
255	24.31372549
256	24.21875000
257	24.12451362
258	24.03100775
259	23.93822394
260	23.84615385
261	24.13793103
262	24.04580153
263	23.95437262
264	24.24242424
265	24.52830189
266	24.43609023
267	24.71910112
268	24.62686567
269	24.53531599
270	24.44444444
271	24.35424354
272	24.26470588
273	24.17582418
274	24.08759124
275	24.00000000
276	23.91304348
277	23.82671480
278	23.74100719
279	23.65591398
280	23.57142857
281	23.84341637
282	23.75886525
283	23.67491166
284	23.94366197
285	23.85964912
286	24.12587413
287	24.04181185
288	23.95833333
289	23.87543253
290	23.79310345
291	24.05498282
292	24.31506849
293	24.57337884
294	24.48979592
295	24.74576271
296	25.00000000
297	25.25252525
298	25.16778523
299	25.08361204
300	25.00000000
301	24.91694352
302	24.83443709
303	24.75247525
304	24.67105263
305	24.59016393
306	24.50980392
307	24.42996743
308	24.67532468
309	24.59546926
310	24.51612903
311	24.43729904
312	24.67948718
313	24.60063898
314	24.52229299
315	24.44444444
316	24.36708861
317	24.29022082
318	24.21383648
319	24.13793103
320	24.06250000
321	23.98753894
322	24.22360248
323	24.14860681
324	24.07407407
325	24.00000000
326	23.92638037
327	23.85321101
328	24.08536585
329	24.01215805
330	24.24242424
331	24.16918429
332	24.39759036
333	24.32432432
334	24.25149701
335	24.47761194
336	24.70238095
337	24.62908012
338	24.55621302
339	24.77876106
340	24.70588235
341	24.92668622
342	25.14619883
343	25.07288630
344	25.29069767
345	25.50724638
346	25.72254335
347	25.93659942
348	25.86206897
349	25.78796562
350	25.71428571
351	25.64102564
352	25.85227273
353	25.77903683
354	25.70621469
355	25.91549296
356	25.84269663
357	25.77030812
358	25.97765363
359	25.90529248
360	25.83333333
361	25.76177285
362	25.69060773
363	25.89531680
364	25.82417582
365	26.02739726
366	25.95628415
367	26.15803815
368	26.08695652
369	26.01626016
370	25.94594595
371	25.87601078
372	26.07526882
373	26.00536193
374	26.20320856
375	26.13333333
376	26.06382979
377	25.99469496
378	25.92592593
379	25.85751979
380	25.78947368
381	25.72178478
382	25.65445026
383	25.58746736
384	25.52083333
385	25.45454545
386	25.38860104
387	25.32299742
388	25.51546392
389	25.44987147
390	25.38461538
391	25.57544757
392	25.51020408
393	25.44529262
394	25.63451777
395	25.56962025
396	25.50505051
397	25.69269521
398	25.62814070
399	25.56390977
400	25.50000000
401	25.43640898
402	25.37313433
403	25.55831266
404	25.49504950
405	25.43209877
406	25.36945813
407	25.30712531
408	25.24509804
409	25.18337408
410	25.12195122
411	25.06082725
412	25.24271845
413	25.18159806
414	25.36231884
415	25.30120482
416	25.24038462
417	25.17985612
418	25.11961722
419	25.05966587
420	25.23809524
421	25.41567696
422	25.35545024
423	25.29550827
424	25.23584906
425	25.17647059
426	25.11737089
427	25.05854801
428	25.00000000
429	24.94172494
430	24.88372093
431	24.82598608
432	25.00000000
433	25.17321016
434	25.34562212
435	25.28735632
436	25.22935780
437	25.17162471
438	25.11415525
439	25.05694761
440	25.22727273
441	25.17006803
442	25.33936652
443	25.28216704
444	25.22522523
445	25.16853933
446	25.11210762
447	25.05592841
448	25.00000000
449	25.16703786
450	25.11111111
451	25.05543237
452	25.00000000
453	25.16556291
454	25.11013216
455	25.05494505
456	25.00000000
457	24.94529540
458	24.89082969
459	24.83660131
460	24.78260870
461	24.72885033
462	24.67532468
463	24.62203024
464	24.56896552
465	24.73118280
466	24.67811159
467	24.62526767
468	24.57264957
469	24.52025586
470	24.46808511
471	24.41613588
472	24.36440678
473	24.31289641
474	24.26160338
475	24.42105263
476	24.36974790
477	24.52830189
478	24.47698745
479	24.42588727
480	24.37500000
481	24.32432432
482	24.48132780
483	24.43064182
484	24.58677686
485	24.74226804
486	24.69135802
487	24.84599589
488	24.79508197
489	24.94887526
490	25.10204082
491	25.05091650
492	25.20325203
493	25.15212982
494	25.10121457
495	25.25252525
496	25.20161290
497	25.15090543
498	25.10040161
499	25.05010020
500	25.00000000
501	25.14970060
502	25.09960159
503	25.04970179
504	25.19841270
505	25.14851485
506	25.09881423
507	25.04930966
508	25.00000000
509	25.14734774
510	25.29411765
511	25.24461840
512	25.19531250
513	25.14619883
514	25.09727626
515	25.24271845
516	25.19379845
517	25.33849130
518	25.28957529
519	25.24084778
520	25.19230769
521	25.33589251
522	25.28735632
523	25.43021033
524	25.38167939
525	25.33333333
526	25.28517110
527	25.42694497
528	25.56818182
529	25.51984877
530	25.47169811
531	25.61205273
532	25.56390977
533	25.70356473
534	25.65543071
535	25.60747664
536	25.55970149
537	25.69832402
538	25.65055762
539	25.60296846
540	25.74074074
541	25.69316081
542	25.64575646
543	25.59852670
544	25.55147059
545	25.50458716
546	25.64102564
547	25.59414991
548	25.72992701
549	25.68306011
550	25.63636364
551	25.58983666
552	25.72463768
553	25.67811935
554	25.63176895
555	25.58558559
556	25.71942446
557	25.67324955
558	25.62724014
559	25.58139535
560	25.53571429
561	25.49019608
562	25.44483986
563	25.39964476
564	25.53191489
565	25.66371681
566	25.61837456
567	25.57319224
568	25.70422535
569	25.83479789
570	25.78947368
571	25.74430823
572	25.69930070
573	25.65445026
574	25.78397213
575	25.91304348
576	25.86805556
577	25.82322357
578	25.95155709
579	26.07944732
580	26.20689655
581	26.16179002
582	26.28865979
583	26.24356775
584	26.36986301
585	26.32478632
586	26.45051195
587	26.40545145
588	26.36054422
589	26.31578947
590	26.44067797
591	26.39593909
592	26.35135135
593	26.30691400
594	26.26262626
595	26.21848739
596	26.17449664
597	26.29815745
598	26.25418060
599	26.21035058
600	26.16666667
601	26.28951747
602	26.41196013
603	26.36815920
604	26.32450331
605	26.28099174
606	26.40264026
607	26.35914333
608	26.31578947
609	26.27257800
610	26.22950820
611	26.18657938
612	26.30718954
613	26.26427406
614	26.22149837
615	26.17886179
616	26.29870130
617	26.41815235
618	26.37540453
619	26.49434572
620	26.45161290
621	26.40901771
622	26.52733119
623	26.48475120
624	26.44230769
625	26.40000000
626	26.35782748
627	26.31578947
628	26.27388535
629	26.23211447
630	26.19047619
631	26.30744849
632	26.42405063
633	26.38230648
634	26.49842271
635	26.45669291
636	26.41509434
637	26.37362637
638	26.33228840
639	26.29107981
640	26.25000000
641	26.20904836
642	26.32398754
643	26.28304821
644	26.39751553
645	26.51162791
646	26.47058824
647	26.42967543
648	26.54320988
649	26.50231125
650	26.61538462
651	26.72811060
652	26.84049080
653	26.95252680
654	26.91131498
655	26.87022901
656	26.82926829
657	26.94063927
658	26.89969605
659	26.85887709
660	26.81818182
661	26.77760968
662	26.73716012
663	26.69683258
664	26.65662651
665	26.76691729
666	26.87687688
667	26.83658171
668	26.79640719
669	26.75635277
670	26.71641791
671	26.67660209
672	26.63690476
673	26.59732541
674	26.55786350
675	26.66666667
676	26.77514793
677	26.73559823
678	26.69616519
679	26.65684831
680	26.76470588
681	26.72540382
682	26.68621701
683	26.64714495
684	26.75438596
685	26.71532847
686	26.67638484
687	26.63755459
688	26.74418605
689	26.85050798
690	26.81159420
691	26.91751085
692	26.87861272
693	26.83982684
694	26.94524496
695	26.90647482
696	27.01149425
697	26.97274032
698	26.93409742
699	27.03862661
700	27.14285714
701	27.10413695
702	27.06552707
703	27.02702703
704	26.98863636
705	26.95035461
706	26.91218130
707	26.87411598
708	26.83615819
709	26.79830748
710	26.76056338
711	26.72292546
712	26.82584270
713	26.78821879
714	26.75070028
715	26.71328671
716	26.67597765
717	26.63877266
718	26.60167131
719	26.56467316
720	26.52777778
721	26.49098474
722	26.45429363
723	26.41770401
724	26.38121547
725	26.34482759
726	26.44628099
727	26.40990371
728	26.37362637
729	26.33744856
730	26.43835616
731	26.40218878
732	26.50273224
733	26.46657572
734	26.43051771
735	26.39455782
736	26.35869565
737	26.32293080
738	26.28726287
739	26.38700947
740	26.35135135
741	26.31578947
742	26.28032345
743	26.24495289
744	26.34408602
745	26.30872483
746	26.40750670
747	26.50602410
748	26.47058824
749	26.56875834
750	26.66666667

Final result: 26.6667 ±1.6158
Random chance: 19.8992 ±1.4588