File size: 12,600 Bytes
6de6927
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	50.00000000
5	40.00000000
6	33.33333333
7	42.85714286
8	50.00000000
9	55.55555556
10	60.00000000
11	54.54545455
12	58.33333333
13	53.84615385
14	50.00000000
15	53.33333333
16	56.25000000
17	58.82352941
18	55.55555556
19	57.89473684
20	60.00000000
21	61.90476190
22	59.09090909
23	56.52173913
24	58.33333333
25	60.00000000
26	57.69230769
27	59.25925926
28	57.14285714
29	58.62068966
30	60.00000000
31	61.29032258
32	62.50000000
33	60.60606061
34	61.76470588
35	62.85714286
36	61.11111111
37	59.45945946
38	60.52631579
39	58.97435897
40	57.50000000
41	56.09756098
42	54.76190476
43	55.81395349
44	54.54545455
45	55.55555556
46	54.34782609
47	53.19148936
48	54.16666667
49	55.10204082
50	54.00000000
51	52.94117647
52	51.92307692
53	50.94339623
54	50.00000000
55	50.90909091
56	51.78571429
57	50.87719298
58	50.00000000
59	49.15254237
60	48.33333333
61	49.18032787
62	48.38709677
63	47.61904762
64	46.87500000
65	46.15384615
66	46.96969697
67	46.26865672
68	45.58823529
69	44.92753623
70	45.71428571
71	46.47887324
72	45.83333333
73	45.20547945
74	45.94594595
75	46.66666667
76	47.36842105
77	48.05194805
78	48.71794872
79	48.10126582
80	48.75000000
81	49.38271605
82	48.78048780
83	49.39759036
84	50.00000000
85	49.41176471
86	48.83720930
87	49.42528736
88	50.00000000
89	49.43820225
90	48.88888889
91	48.35164835
92	47.82608696
93	47.31182796
94	47.87234043
95	47.36842105
96	47.91666667
97	47.42268041
98	46.93877551
99	46.46464646
100	46.00000000
101	46.53465347
102	46.07843137
103	45.63106796
104	46.15384615
105	46.66666667
106	46.22641509
107	45.79439252
108	46.29629630
109	46.78899083
110	47.27272727
111	47.74774775
112	47.32142857
113	46.90265487
114	47.36842105
115	47.82608696
116	47.41379310
117	47.86324786
118	48.30508475
119	47.89915966
120	47.50000000
121	47.93388430
122	47.54098361
123	47.15447154
124	46.77419355
125	47.20000000
126	46.82539683
127	46.45669291
128	46.09375000
129	45.73643411
130	45.38461538
131	45.03816794
132	44.69696970
133	44.36090226
134	44.02985075
135	43.70370370
136	43.38235294
137	43.79562044
138	43.47826087
139	43.88489209
140	43.57142857
141	43.26241135
142	43.66197183
143	44.05594406
144	44.44444444
145	44.82758621
146	45.20547945
147	44.89795918
148	45.27027027
149	44.96644295
150	44.66666667
151	44.37086093
152	44.73684211
153	45.09803922
154	44.80519481
155	44.51612903
156	44.23076923
157	44.58598726
158	44.93670886
159	44.65408805
160	45.00000000
161	44.72049689
162	44.44444444
163	44.17177914
164	43.90243902
165	44.24242424
166	43.97590361
167	44.31137725
168	44.04761905
169	44.37869822
170	44.11764706
171	44.44444444
172	44.18604651
173	43.93063584
174	43.67816092
175	43.42857143
176	43.18181818
177	42.93785311
178	42.69662921
179	43.01675978
180	43.33333333
181	43.64640884
182	43.40659341
183	43.71584699
184	43.47826087
185	43.78378378
186	44.08602151
187	43.85026738
188	44.14893617
189	44.44444444
190	44.21052632
191	44.50261780
192	44.27083333
193	44.04145078
194	44.32989691
195	44.61538462
196	44.89795918
197	44.67005076
198	44.44444444
199	44.22110553
200	44.00000000
201	43.78109453
202	43.56435644
203	43.34975369
204	43.13725490
205	43.41463415
206	43.20388350
207	43.47826087
208	43.26923077
209	43.06220096
210	42.85714286
211	43.12796209
212	42.92452830
213	42.72300469
214	42.52336449
215	42.79069767
216	43.05555556
217	42.85714286
218	43.11926606
219	43.37899543
220	43.63636364
221	43.43891403
222	43.69369369
223	43.94618834
224	43.75000000
225	44.00000000
226	43.80530973
227	43.61233480
228	43.85964912
229	43.66812227
230	43.47826087
231	43.29004329
232	43.53448276
233	43.34763948
234	43.58974359
235	43.40425532
236	43.22033898
237	43.03797468
238	42.85714286
239	42.67782427
240	42.91666667
241	42.73858921
242	42.97520661
243	43.20987654
244	43.03278689
245	43.26530612
246	43.08943089
247	42.91497976
248	43.14516129
249	42.97188755
250	42.80000000
251	42.62948207
252	42.46031746
253	42.29249012
254	42.12598425
255	41.96078431
256	42.18750000
257	42.41245136
258	42.24806202
259	42.08494208
260	42.30769231
261	42.14559387
262	41.98473282
263	41.82509506
264	41.66666667
265	41.88679245
266	42.10526316
267	41.94756554
268	41.79104478
269	42.00743494
270	41.85185185
271	41.69741697
272	41.91176471
273	41.75824176
274	41.60583942
275	41.45454545
276	41.66666667
277	41.87725632
278	41.72661871
279	41.57706093
280	41.78571429
281	41.63701068
282	41.84397163
283	41.69611307
284	41.54929577
285	41.40350877
286	41.25874126
287	41.46341463
288	41.31944444
289	41.17647059
290	41.37931034
291	41.58075601
292	41.78082192
293	41.63822526
294	41.49659864
295	41.35593220
296	41.21621622
297	41.07744108
298	41.27516779
299	41.13712375
300	41.33333333
301	41.19601329
302	41.05960265
303	40.92409241
304	40.78947368
305	40.98360656
306	40.84967320
307	41.04234528
308	40.90909091
309	40.77669903
310	40.64516129
311	40.51446945
312	40.70512821
313	40.57507987
314	40.44585987
315	40.31746032
316	40.18987342
317	40.37854890
318	40.56603774
319	40.75235110
320	40.62500000
321	40.49844237
322	40.37267081
323	40.24767802
324	40.12345679
325	40.30769231
326	40.18404908
327	40.36697248
328	40.24390244
329	40.12158055
330	40.00000000
331	39.87915408
332	40.06024096
333	40.24024024
334	40.11976048
335	40.29850746
336	40.47619048
337	40.35608309
338	40.23668639
339	40.11799410
340	40.29411765
341	40.46920821
342	40.64327485
343	40.52478134
344	40.40697674
345	40.28985507
346	40.46242775
347	40.63400576
348	40.51724138
349	40.68767908
350	40.85714286
351	40.74074074
352	40.62500000
353	40.79320113
354	40.67796610
355	40.56338028
356	40.73033708
357	40.89635854
358	40.78212291
359	40.66852368
360	40.83333333
361	40.72022161
362	40.88397790
363	41.04683196
364	40.93406593
365	40.82191781
366	40.98360656
367	40.87193460
368	40.76086957
369	40.65040650
370	40.81081081
371	40.70080863
372	40.59139785
373	40.48257373
374	40.37433155
375	40.53333333
376	40.42553191
377	40.31830239
378	40.47619048
379	40.36939314
380	40.26315789
381	40.41994751
382	40.57591623
383	40.73107050
384	40.62500000
385	40.77922078
386	40.67357513
387	40.56847545
388	40.46391753
389	40.61696658
390	40.76923077
391	40.92071611
392	41.07142857
393	40.96692112
394	40.86294416
395	41.01265823
396	41.16161616
397	41.05793451
398	40.95477387
399	41.10275689
400	41.00000000
401	41.14713217
402	41.04477612
403	40.94292804
404	40.84158416
405	40.74074074
406	40.64039409
407	40.54054054
408	40.44117647
409	40.58679707
410	40.48780488
411	40.38929440
412	40.29126214
413	40.19370460
414	40.09661836
415	40.00000000
416	40.14423077
417	40.04796163
418	39.95215311
419	40.09546539
420	40.00000000
421	40.14251781
422	40.04739336
423	39.95271868
424	40.09433962
425	40.00000000
426	39.90610329
427	40.04683841
428	39.95327103
429	40.09324009
430	40.00000000
431	40.13921114
432	40.04629630
433	39.95381062
434	39.86175115
435	39.77011494
436	39.67889908
437	39.58810069
438	39.72602740
439	39.86332574
440	39.77272727
441	39.90929705
442	39.81900452
443	39.72911964
444	39.63963964
445	39.55056180
446	39.46188341
447	39.37360179
448	39.28571429
449	39.19821826
450	39.33333333
451	39.46784922
452	39.38053097
453	39.51434879
454	39.42731278
455	39.34065934
456	39.47368421
457	39.38730853
458	39.30131004
459	39.43355120
460	39.56521739
461	39.47939262
462	39.61038961
463	39.52483801
464	39.65517241
465	39.56989247
466	39.48497854
467	39.61456103
468	39.74358974
469	39.87206823
470	39.78723404
471	39.70276008
472	39.61864407
473	39.74630021
474	39.66244726
475	39.57894737
476	39.70588235
477	39.62264151
478	39.74895397
479	39.87473904
480	40.00000000
481	39.91683992
482	39.83402490
483	39.75155280
484	39.66942149
485	39.58762887
486	39.71193416
487	39.83572895
488	39.75409836
489	39.67280164
490	39.79591837
491	39.71486762
492	39.63414634
493	39.55375254
494	39.67611336
495	39.79797980
496	39.91935484
497	39.83903421
498	39.75903614
499	39.67935872
500	39.80000000
501	39.72055888
502	39.64143426
503	39.76143141
504	39.68253968
505	39.60396040
506	39.72332016
507	39.64497041
508	39.56692913
509	39.68565815
510	39.80392157
511	39.72602740
512	39.64843750
513	39.57115010
514	39.49416342
515	39.61165049
516	39.72868217
517	39.84526112
518	39.96138996
519	39.88439306
520	39.80769231
521	39.73128599
522	39.84674330
523	39.77055449
524	39.88549618
525	39.80952381
526	39.73384030
527	39.84819734
528	39.96212121
529	39.88657845
530	40.00000000
531	39.92467043
532	39.84962406
533	39.77485929
534	39.70037453
535	39.81308411
536	39.73880597
537	39.85102421
538	39.96282528
539	39.88868275
540	39.81481481
541	39.74121996
542	39.85239852
543	39.77900552
544	39.88970588
545	39.81651376
546	39.74358974
547	39.85374771
548	39.96350365
549	39.89071038
550	39.81818182
551	39.92740472
552	39.85507246
553	39.78300181
554	39.71119134
555	39.63963964
556	39.74820144
557	39.67684022
558	39.78494624
559	39.89266547
560	40.00000000
561	39.92869875
562	40.03558719
563	40.14209591
564	40.07092199
565	40.17699115
566	40.10600707
567	40.03527337
568	40.14084507
569	40.07029877
570	40.00000000
571	39.92994746
572	39.86013986
573	39.79057592
574	39.72125436
575	39.65217391
576	39.75694444
577	39.86135182
578	39.96539792
579	39.89637306
580	39.82758621
581	39.75903614
582	39.69072165
583	39.79416810
584	39.89726027
585	40.00000000
586	39.93174061
587	39.86371380
588	39.79591837
589	39.72835314
590	39.66101695
591	39.59390863
592	39.52702703
593	39.62900506
594	39.73063973
595	39.66386555
596	39.76510067
597	39.69849246
598	39.63210702
599	39.73288815
600	39.66666667
601	39.76705491
602	39.70099668
603	39.80099502
604	39.73509934
605	39.66942149
606	39.60396040
607	39.70345964
608	39.63815789
609	39.57307061
610	39.50819672
611	39.60720131
612	39.54248366
613	39.64110930
614	39.57654723
615	39.51219512
616	39.61038961
617	39.54619125
618	39.48220065
619	39.41841680
620	39.35483871
621	39.29146538
622	39.38906752
623	39.32584270
624	39.26282051
625	39.20000000
626	39.29712460
627	39.23444976
628	39.17197452
629	39.26868045
630	39.20634921
631	39.30269414
632	39.39873418
633	39.33649289
634	39.27444795
635	39.21259843
636	39.15094340
637	39.24646782
638	39.18495298
639	39.12363067
640	39.21875000
641	39.15756630
642	39.09657321
643	39.03576983
644	39.13043478
645	39.06976744
646	39.00928793
647	38.94899536
648	38.88888889
649	38.82896764
650	38.76923077
651	38.70967742
652	38.65030675
653	38.59111792
654	38.68501529
655	38.62595420
656	38.71951220
657	38.81278539
658	38.90577508
659	38.84673748
660	38.78787879
661	38.72919818
662	38.67069486
663	38.61236802
664	38.70481928
665	38.79699248
666	38.73873874
667	38.83058471
668	38.92215569
669	39.01345291
670	38.95522388
671	39.04619970
672	38.98809524
673	38.93016345
674	38.87240356
675	38.81481481
676	38.75739645
677	38.70014771
678	38.79056047
679	38.73343152
680	38.82352941
681	38.91336270
682	38.85630499
683	38.79941435
684	38.88888889
685	38.97810219
686	38.92128280
687	38.86462882
688	38.80813953
689	38.89695210
690	38.84057971
691	38.78437048
692	38.72832370
693	38.67243867
694	38.61671470
695	38.56115108
696	38.64942529
697	38.59397418
698	38.68194842
699	38.76967096
700	38.71428571
701	38.65905849
702	38.74643875
703	38.83357041
704	38.92045455
705	38.86524823
706	38.81019830
707	38.75530410
708	38.70056497
709	38.64598025
710	38.59154930
711	38.67791842
712	38.62359551
713	38.70967742
714	38.79551821
715	38.74125874
716	38.68715084
717	38.63319386
718	38.57938719
719	38.52573018
720	38.47222222
721	38.41886269
722	38.36565097
723	38.45089903
724	38.53591160
725	38.62068966
726	38.56749311
727	38.65199450
728	38.59890110
729	38.54595336
730	38.49315068
731	38.44049248
732	38.38797814
733	38.47203274
734	38.41961853
735	38.36734694
736	38.31521739
737	38.39891452
738	38.48238482
739	38.56562923
740	38.51351351
741	38.46153846
742	38.54447439
743	38.62718708
744	38.70967742
745	38.65771812
746	38.60589812
747	38.55421687
748	38.50267380
749	38.45126836
750	38.53333333

Final result: 38.5333 +/- 1.7783
Random chance: 25.0000 +/- 1.5822