File size: 12,596 Bytes
94d6d71
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	50.00000000
5	40.00000000
6	33.33333333
7	42.85714286
8	37.50000000
9	33.33333333
10	30.00000000
11	27.27272727
12	25.00000000
13	30.76923077
14	28.57142857
15	33.33333333
16	37.50000000
17	35.29411765
18	33.33333333
19	36.84210526
20	35.00000000
21	33.33333333
22	31.81818182
23	30.43478261
24	29.16666667
25	32.00000000
26	30.76923077
27	29.62962963
28	32.14285714
29	34.48275862
30	36.66666667
31	38.70967742
32	40.62500000
33	39.39393939
34	38.23529412
35	40.00000000
36	38.88888889
37	37.83783784
38	36.84210526
39	35.89743590
40	35.00000000
41	34.14634146
42	33.33333333
43	32.55813953
44	31.81818182
45	31.11111111
46	30.43478261
47	29.78723404
48	31.25000000
49	32.65306122
50	34.00000000
51	33.33333333
52	32.69230769
53	32.07547170
54	31.48148148
55	32.72727273
56	32.14285714
57	33.33333333
58	32.75862069
59	32.20338983
60	31.66666667
61	31.14754098
62	30.64516129
63	30.15873016
64	29.68750000
65	30.76923077
66	31.81818182
67	31.34328358
68	30.88235294
69	30.43478261
70	31.42857143
71	30.98591549
72	31.94444444
73	31.50684932
74	31.08108108
75	32.00000000
76	32.89473684
77	33.76623377
78	34.61538462
79	34.17721519
80	35.00000000
81	35.80246914
82	35.36585366
83	36.14457831
84	36.90476190
85	36.47058824
86	36.04651163
87	36.78160920
88	37.50000000
89	37.07865169
90	36.66666667
91	36.26373626
92	35.86956522
93	35.48387097
94	36.17021277
95	35.78947368
96	36.45833333
97	36.08247423
98	35.71428571
99	35.35353535
100	36.00000000
101	35.64356436
102	35.29411765
103	34.95145631
104	34.61538462
105	34.28571429
106	33.96226415
107	33.64485981
108	33.33333333
109	33.02752294
110	32.72727273
111	32.43243243
112	32.14285714
113	31.85840708
114	32.45614035
115	33.04347826
116	32.75862069
117	32.47863248
118	32.20338983
119	31.93277311
120	31.66666667
121	31.40495868
122	31.14754098
123	30.89430894
124	30.64516129
125	31.20000000
126	30.95238095
127	30.70866142
128	30.46875000
129	30.23255814
130	30.00000000
131	30.53435115
132	31.06060606
133	30.82706767
134	30.59701493
135	30.37037037
136	30.14705882
137	30.65693431
138	30.43478261
139	30.93525180
140	30.71428571
141	31.20567376
142	30.98591549
143	31.46853147
144	31.94444444
145	31.72413793
146	31.50684932
147	31.29251701
148	31.75675676
149	31.54362416
150	31.33333333
151	31.12582781
152	31.57894737
153	31.37254902
154	31.16883117
155	30.96774194
156	30.76923077
157	31.21019108
158	31.01265823
159	30.81761006
160	31.25000000
161	31.05590062
162	30.86419753
163	30.67484663
164	31.09756098
165	31.51515152
166	31.32530120
167	31.73652695
168	31.54761905
169	31.36094675
170	31.17647059
171	30.99415205
172	30.81395349
173	30.63583815
174	30.45977011
175	30.28571429
176	30.11363636
177	29.94350282
178	29.77528090
179	30.16759777
180	30.00000000
181	29.83425414
182	29.67032967
183	30.05464481
184	29.89130435
185	29.72972973
186	29.56989247
187	29.41176471
188	29.25531915
189	29.10052910
190	28.94736842
191	29.31937173
192	29.16666667
193	29.01554404
194	29.38144330
195	29.74358974
196	29.59183673
197	29.44162437
198	29.29292929
199	29.14572864
200	29.00000000
201	29.35323383
202	29.20792079
203	29.06403941
204	28.92156863
205	29.26829268
206	29.12621359
207	28.98550725
208	28.84615385
209	28.70813397
210	28.57142857
211	28.90995261
212	28.77358491
213	28.63849765
214	28.50467290
215	28.37209302
216	28.24074074
217	28.11059908
218	27.98165138
219	27.85388128
220	27.72727273
221	27.60180995
222	27.92792793
223	27.80269058
224	27.67857143
225	28.00000000
226	27.87610619
227	28.19383260
228	28.50877193
229	28.38427948
230	28.26086957
231	28.57142857
232	28.44827586
233	28.32618026
234	28.63247863
235	28.51063830
236	28.38983051
237	28.27004219
238	28.15126050
239	28.03347280
240	28.33333333
241	28.21576763
242	28.51239669
243	28.80658436
244	28.68852459
245	28.97959184
246	28.86178862
247	28.74493927
248	28.62903226
249	28.91566265
250	28.80000000
251	28.68525896
252	28.57142857
253	28.45849802
254	28.34645669
255	28.62745098
256	28.51562500
257	28.40466926
258	28.29457364
259	28.18532819
260	28.07692308
261	27.96934866
262	27.86259542
263	27.75665399
264	27.65151515
265	27.54716981
266	27.81954887
267	27.71535581
268	27.61194030
269	27.50929368
270	27.40740741
271	27.30627306
272	27.57352941
273	27.47252747
274	27.37226277
275	27.63636364
276	27.89855072
277	28.15884477
278	28.41726619
279	28.31541219
280	28.57142857
281	28.46975089
282	28.36879433
283	28.26855124
284	28.16901408
285	28.07017544
286	27.97202797
287	28.22299652
288	28.12500000
289	28.02768166
290	28.27586207
291	28.52233677
292	28.76712329
293	29.01023891
294	29.25170068
295	29.15254237
296	29.05405405
297	28.95622896
298	28.85906040
299	29.09698997
300	29.33333333
301	29.56810631
302	29.80132450
303	29.70297030
304	29.60526316
305	29.83606557
306	29.73856209
307	29.64169381
308	29.54545455
309	29.44983819
310	29.35483871
311	29.26045016
312	29.48717949
313	29.71246006
314	29.93630573
315	29.84126984
316	29.74683544
317	29.65299685
318	29.55974843
319	29.78056426
320	29.68750000
321	29.59501558
322	29.50310559
323	29.41176471
324	29.32098765
325	29.53846154
326	29.44785276
327	29.66360856
328	29.87804878
329	29.78723404
330	29.69696970
331	29.60725076
332	29.81927711
333	29.72972973
334	29.64071856
335	29.85074627
336	29.76190476
337	29.67359050
338	29.58579882
339	29.49852507
340	29.70588235
341	29.61876833
342	29.53216374
343	29.44606414
344	29.36046512
345	29.27536232
346	29.47976879
347	29.68299712
348	29.59770115
349	29.79942693
350	30.00000000
351	29.91452991
352	29.82954545
353	30.02832861
354	29.94350282
355	29.85915493
356	30.05617978
357	29.97198880
358	29.88826816
359	29.80501393
360	30.00000000
361	30.19390582
362	30.38674033
363	30.57851240
364	30.49450549
365	30.41095890
366	30.32786885
367	30.24523161
368	30.16304348
369	30.08130081
370	30.27027027
371	30.45822102
372	30.37634409
373	30.29490617
374	30.21390374
375	30.40000000
376	30.31914894
377	30.23872679
378	30.42328042
379	30.34300792
380	30.26315789
381	30.44619423
382	30.36649215
383	30.54830287
384	30.46875000
385	30.64935065
386	30.56994819
387	30.49095607
388	30.41237113
389	30.59125964
390	30.51282051
391	30.69053708
392	30.86734694
393	30.78880407
394	30.71065990
395	30.63291139
396	30.80808081
397	30.98236776
398	30.90452261
399	31.07769424
400	31.00000000
401	31.17206983
402	31.34328358
403	31.26550868
404	31.18811881
405	31.11111111
406	31.03448276
407	30.95823096
408	30.88235294
409	31.05134474
410	30.97560976
411	30.90024331
412	30.82524272
413	30.99273608
414	31.15942029
415	31.32530120
416	31.25000000
417	31.17505995
418	31.10047847
419	31.26491647
420	31.42857143
421	31.35391924
422	31.27962085
423	31.20567376
424	31.36792453
425	31.52941176
426	31.45539906
427	31.38173302
428	31.30841121
429	31.46853147
430	31.39534884
431	31.55452436
432	31.71296296
433	31.63972286
434	31.56682028
435	31.49425287
436	31.42201835
437	31.35011442
438	31.50684932
439	31.66287016
440	31.59090909
441	31.51927438
442	31.44796380
443	31.37697517
444	31.53153153
445	31.46067416
446	31.39013453
447	31.31991051
448	31.25000000
449	31.18040089
450	31.11111111
451	31.04212860
452	31.19469027
453	31.34657837
454	31.49779736
455	31.42857143
456	31.57894737
457	31.50984683
458	31.44104803
459	31.59041394
460	31.73913043
461	31.67028200
462	31.60173160
463	31.53347732
464	31.46551724
465	31.39784946
466	31.33047210
467	31.47751606
468	31.41025641
469	31.55650320
470	31.48936170
471	31.42250531
472	31.35593220
473	31.50105708
474	31.43459916
475	31.36842105
476	31.51260504
477	31.44654088
478	31.58995816
479	31.73277662
480	31.66666667
481	31.80873181
482	31.74273859
483	31.67701863
484	31.61157025
485	31.54639175
486	31.48148148
487	31.62217659
488	31.55737705
489	31.49284254
490	31.63265306
491	31.77189409
492	31.70731707
493	31.64300203
494	31.78137652
495	31.91919192
496	32.05645161
497	31.99195171
498	31.92771084
499	31.86372745
500	32.00000000
501	31.93612774
502	31.87250996
503	31.80914513
504	31.94444444
505	31.88118812
506	32.01581028
507	31.95266272
508	31.88976378
509	31.82711198
510	31.96078431
511	31.89823875
512	31.83593750
513	31.77387914
514	31.71206226
515	31.65048544
516	31.58914729
517	31.72147002
518	31.85328185
519	31.79190751
520	31.73076923
521	31.66986564
522	31.60919540
523	31.73996176
524	31.87022901
525	31.80952381
526	31.74904943
527	31.87855787
528	31.81818182
529	31.94706994
530	32.07547170
531	32.01506591
532	31.95488722
533	31.89493433
534	31.83520599
535	31.96261682
536	31.90298507
537	32.02979516
538	32.15613383
539	32.09647495
540	32.03703704
541	31.97781885
542	32.10332103
543	32.04419890
544	31.98529412
545	31.92660550
546	31.86813187
547	31.80987203
548	31.93430657
549	31.87613843
550	31.81818182
551	31.94192377
552	31.88405797
553	31.82640145
554	31.76895307
555	31.71171171
556	31.83453237
557	31.77737882
558	31.89964158
559	31.84257603
560	31.78571429
561	31.72905526
562	31.67259786
563	31.79396092
564	31.73758865
565	31.85840708
566	31.80212014
567	31.74603175
568	31.86619718
569	31.81019332
570	31.75438596
571	31.69877408
572	31.64335664
573	31.58813264
574	31.70731707
575	31.65217391
576	31.77083333
577	31.88908146
578	31.83391003
579	31.95164076
580	31.89655172
581	31.84165232
582	31.78694158
583	31.90394511
584	32.02054795
585	31.96581197
586	32.08191126
587	32.02725724
588	31.97278912
589	31.91850594
590	31.86440678
591	31.81049069
592	31.75675676
593	31.70320405
594	31.81818182
595	31.76470588
596	31.87919463
597	31.82579564
598	31.77257525
599	31.88647746
600	31.83333333
601	31.94675541
602	31.89368771
603	32.00663350
604	31.95364238
605	32.06611570
606	32.01320132
607	32.12520593
608	32.07236842
609	32.01970443
610	31.96721311
611	31.91489362
612	31.86274510
613	31.97389886
614	32.08469055
615	32.03252033
616	32.14285714
617	32.09076175
618	32.03883495
619	31.98707593
620	31.93548387
621	31.88405797
622	31.83279743
623	31.78170144
624	31.73076923
625	31.68000000
626	31.78913738
627	31.73843700
628	31.68789809
629	31.79650238
630	31.90476190
631	31.85419968
632	31.96202532
633	31.91153239
634	31.86119874
635	31.81102362
636	31.76100629
637	31.71114600
638	31.66144201
639	31.76838811
640	31.87500000
641	31.82527301
642	31.93146417
643	31.88180404
644	31.98757764
645	31.93798450
646	31.88854489
647	31.83925811
648	31.79012346
649	31.74114022
650	31.69230769
651	31.64362519
652	31.74846626
653	31.69984686
654	31.80428135
655	31.75572519
656	31.85975610
657	31.81126332
658	31.91489362
659	31.86646434
660	31.81818182
661	31.77004539
662	31.72205438
663	31.67420814
664	31.77710843
665	31.72932331
666	31.68168168
667	31.63418291
668	31.73652695
669	31.68908819
670	31.64179104
671	31.59463487
672	31.54761905
673	31.50074294
674	31.45400593
675	31.40740741
676	31.36094675
677	31.31462334
678	31.41592920
679	31.36966127
680	31.47058824
681	31.57121880
682	31.52492669
683	31.62518302
684	31.57894737
685	31.53284672
686	31.48688047
687	31.44104803
688	31.39534884
689	31.34978229
690	31.30434783
691	31.25904486
692	31.21387283
693	31.16883117
694	31.26801153
695	31.22302158
696	31.32183908
697	31.27690100
698	31.23209169
699	31.33047210
700	31.28571429
701	31.38373752
702	31.48148148
703	31.57894737
704	31.53409091
705	31.48936170
706	31.44475921
707	31.40028289
708	31.35593220
709	31.31170663
710	31.40845070
711	31.50492264
712	31.46067416
713	31.55680224
714	31.65266106
715	31.60839161
716	31.56424581
717	31.52022315
718	31.47632312
719	31.43254520
720	31.38888889
721	31.34535368
722	31.30193906
723	31.39695712
724	31.35359116
725	31.44827586
726	31.54269972
727	31.49931224
728	31.45604396
729	31.41289438
730	31.36986301
731	31.32694938
732	31.42076503
733	31.51432469
734	31.60762943
735	31.56462585
736	31.52173913
737	31.61465400
738	31.57181572
739	31.66441137
740	31.62162162
741	31.57894737
742	31.67115903
743	31.76312248
744	31.85483871
745	31.81208054
746	31.76943700
747	31.72690763
748	31.68449198
749	31.64218959
750	31.73333333

Final result: 31.7333 ±1.7007
Random chance: 25.0000 ±1.5822