common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	0.00000000
2	0.00000000
3	0.00000000
4	25.00000000
5	20.00000000
6	16.66666667
7	28.57142857
8	25.00000000
9	22.22222222
10	20.00000000
11	18.18181818
12	16.66666667
13	23.07692308
14	21.42857143
15	26.66666667
16	31.25000000
17	29.41176471
18	27.77777778
19	31.57894737
20	30.00000000
21	28.57142857
22	27.27272727
23	26.08695652
24	25.00000000
25	28.00000000
26	26.92307692
27	25.92592593
28	28.57142857
29	31.03448276
30	33.33333333
31	35.48387097
32	37.50000000
33	36.36363636
34	35.29411765
35	37.14285714
36	36.11111111
37	35.13513514
38	36.84210526
39	35.89743590
40	35.00000000
41	36.58536585
42	35.71428571
43	34.88372093
44	34.09090909
45	33.33333333
46	32.60869565
47	31.91489362
48	33.33333333
49	34.69387755
50	36.00000000
51	35.29411765
52	36.53846154
53	35.84905660
54	35.18518519
55	36.36363636
56	37.50000000
57	36.84210526
58	36.20689655
59	35.59322034
60	35.00000000
61	34.42622951
62	33.87096774
63	33.33333333
64	32.81250000
65	32.30769231
66	33.33333333
67	32.83582090
68	32.35294118
69	31.88405797
70	32.85714286
71	33.80281690
72	34.72222222
73	34.24657534
74	35.13513514
75	36.00000000
76	36.84210526
77	37.66233766
78	38.46153846
79	37.97468354
80	38.75000000
81	38.27160494
82	37.80487805
83	38.55421687
84	39.28571429
85	38.82352941
86	38.37209302
87	39.08045977
88	39.77272727
89	39.32584270
90	38.88888889
91	38.46153846
92	38.04347826
93	37.63440860
94	37.23404255
95	36.84210526
96	37.50000000
97	37.11340206
98	36.73469388
99	36.36363636
100	37.00000000
101	36.63366337
102	36.27450980
103	35.92233010
104	35.57692308
105	35.23809524
106	34.90566038
107	34.57943925
108	34.25925926
109	33.94495413
110	33.63636364
111	33.33333333
112	33.03571429
113	33.62831858
114	34.21052632
115	34.78260870
116	34.48275862
117	34.18803419
118	33.89830508
119	33.61344538
120	33.33333333
121	33.05785124
122	32.78688525
123	32.52032520
124	32.25806452
125	32.80000000
126	32.53968254
127	32.28346457
128	32.03125000
129	31.78294574
130	31.53846154
131	32.06106870
132	31.81818182
133	31.57894737
134	31.34328358
135	31.11111111
136	31.61764706
137	31.38686131
138	31.15942029
139	31.65467626
140	31.42857143
141	31.20567376
142	31.69014085
143	32.16783217
144	32.63888889
145	32.41379310
146	32.19178082
147	31.97278912
148	32.43243243
149	32.21476510
150	32.00000000
151	31.78807947
152	32.23684211
153	32.67973856
154	32.46753247
155	32.25806452
156	32.05128205
157	32.48407643
158	32.91139241
159	32.70440252
160	33.12500000
161	32.91925466
162	32.71604938
163	32.51533742
164	32.31707317
165	32.72727273
166	32.53012048
167	32.93413174
168	32.73809524
169	32.54437870
170	32.35294118
171	32.16374269
172	31.97674419
173	31.79190751
174	31.60919540
175	31.42857143
176	31.25000000
177	31.07344633
178	30.89887640
179	31.28491620
180	31.11111111
181	30.93922652
182	30.76923077
183	31.14754098
184	30.97826087
185	31.35135135
186	31.72043011
187	31.55080214
188	31.38297872
189	31.21693122
190	31.05263158
191	31.41361257
192	31.25000000
193	31.08808290
194	31.44329897
195	31.79487179
196	31.63265306
197	31.47208122
198	31.31313131
199	31.15577889
200	31.00000000
201	31.34328358
202	31.18811881
203	31.03448276
204	30.88235294
205	31.21951220
206	31.06796117
207	30.91787440
208	30.76923077
209	30.62200957
210	30.47619048
211	30.80568720
212	30.66037736
213	30.51643192
214	30.37383178
215	30.23255814
216	30.09259259
217	29.95391705
218	29.81651376
219	29.68036530
220	29.54545455
221	29.41176471
222	29.72972973
223	29.59641256
224	29.46428571
225	29.77777778
226	29.64601770
227	29.95594714
228	30.26315789
229	30.13100437
230	30.00000000
231	29.87012987
232	29.74137931
233	29.61373391
234	29.91452991
235	29.78723404
236	29.66101695
237	29.53586498
238	29.41176471
239	29.28870293
240	29.58333333
241	29.46058091
242	29.75206612
243	30.04115226
244	29.91803279
245	30.20408163
246	30.08130081
247	29.95951417
248	30.24193548
249	30.12048193
250	30.00000000
251	29.88047809
252	29.76190476
253	29.64426877
254	29.52755906
255	29.80392157
256	30.07812500
257	30.35019455
258	30.23255814
259	30.50193050
260	30.38461538
261	30.26819923
262	30.15267176
263	30.03802281
264	29.92424242
265	29.81132075
266	29.69924812
267	29.58801498
268	29.47761194
269	29.36802974
270	29.25925926
271	29.15129151
272	29.41176471
273	29.30402930
274	29.19708029
275	29.09090909
276	29.34782609
277	29.60288809
278	29.85611511
279	29.74910394
280	30.00000000
281	29.89323843
282	29.78723404
283	29.68197880
284	29.57746479
285	29.47368421
286	29.37062937
287	29.61672474
288	29.51388889
289	29.41176471
290	29.65517241
291	29.89690722
292	30.13698630
293	30.37542662
294	30.61224490
295	30.50847458
296	30.40540541
297	30.30303030
298	30.53691275
299	30.43478261
300	30.66666667
301	30.89700997
302	31.12582781
303	31.02310231
304	31.25000000
305	31.47540984
306	31.37254902
307	31.59609121
308	31.49350649
309	31.71521036
310	31.61290323
311	31.83279743
312	32.05128205
313	32.26837061
314	32.16560510
315	32.06349206
316	31.96202532
317	31.86119874
318	31.76100629
319	31.97492163
320	31.87500000
321	31.77570093
322	31.67701863
323	31.57894737
324	31.48148148
325	31.69230769
326	31.59509202
327	31.80428135
328	31.70731707
329	31.61094225
330	31.51515152
331	31.41993958
332	31.62650602
333	31.53153153
334	31.43712575
335	31.64179104
336	31.54761905
337	31.45400593
338	31.36094675
339	31.26843658
340	31.17647059
341	31.37829912
342	31.28654971
343	31.19533528
344	31.10465116
345	31.01449275
346	31.21387283
347	31.41210375
348	31.32183908
349	31.23209169
350	31.42857143
351	31.33903134
352	31.25000000
353	31.44475921
354	31.35593220
355	31.26760563
356	31.46067416
357	31.37254902
358	31.28491620
359	31.19777159
360	31.38888889
361	31.57894737
362	31.76795580
363	31.95592287
364	31.86813187
365	31.78082192
366	31.69398907
367	31.60762943
368	31.52173913
369	31.43631436
370	31.62162162
371	31.53638814
372	31.45161290
373	31.36729223
374	31.55080214
375	31.73333333
376	31.64893617
377	31.56498674
378	31.48148148
379	31.39841689
380	31.31578947
381	31.49606299
382	31.41361257
383	31.59268930
384	31.51041667
385	31.68831169
386	31.60621762
387	31.52454780
388	31.44329897
389	31.61953728
390	31.79487179
391	31.96930946
392	32.14285714
393	32.06106870
394	31.97969543
395	31.89873418
396	32.07070707
397	31.98992443
398	31.90954774
399	32.08020050
400	32.00000000
401	32.16957606
402	32.08955224
403	32.00992556
404	31.93069307
405	31.85185185
406	31.77339901
407	31.69533170
408	31.61764706
409	31.78484108
410	31.70731707
411	31.63017032
412	31.55339806
413	31.71912833
414	31.88405797
415	32.04819277
416	31.97115385
417	31.89448441
418	31.81818182
419	31.98090692
420	32.14285714
421	32.06650831
422	31.99052133
423	31.91489362
424	32.07547170
425	32.00000000
426	31.92488263
427	31.85011710
428	31.77570093
429	31.93473193
430	31.86046512
431	32.01856148
432	32.17592593
433	32.10161663
434	32.25806452
435	32.18390805
436	32.11009174
437	32.03661327
438	32.19178082
439	32.11845103
440	32.04545455
441	31.97278912
442	31.90045249
443	31.82844244
444	31.75675676
445	31.68539326
446	31.61434978
447	31.54362416
448	31.47321429
449	31.40311804
450	31.33333333
451	31.26385809
452	31.41592920
453	31.56732892
454	31.49779736
455	31.42857143
456	31.57894737
457	31.50984683
458	31.44104803
459	31.59041394
460	31.73913043
461	31.67028200
462	31.60173160
463	31.53347732
464	31.46551724
465	31.39784946
466	31.33047210
467	31.47751606
468	31.41025641
469	31.55650320
470	31.70212766
471	31.63481953
472	31.56779661
473	31.71247357
474	31.64556962
475	31.57894737
476	31.72268908
477	31.65618449
478	31.79916318
479	31.94154489
480	31.87500000
481	31.80873181
482	31.74273859
483	31.67701863
484	31.61157025
485	31.54639175
486	31.48148148
487	31.62217659
488	31.55737705
489	31.49284254
490	31.63265306
491	31.77189409
492	31.70731707
493	31.64300203
494	31.78137652
495	31.91919192
496	32.05645161
497	31.99195171
498	31.92771084
499	31.86372745
500	32.00000000
501	31.93612774
502	31.87250996
503	31.80914513
504	31.94444444
505	31.88118812
506	32.01581028
507	31.95266272
508	31.88976378
509	31.82711198
510	31.96078431
511	31.89823875
512	31.83593750
513	31.77387914
514	31.71206226
515	31.65048544
516	31.58914729
517	31.72147002
518	31.85328185
519	31.79190751
520	31.73076923
521	31.66986564
522	31.80076628
523	31.93116635
524	32.06106870
525	32.00000000
526	31.93916350
527	32.06831120
528	32.00757576
529	31.94706994
530	32.07547170
531	32.01506591
532	31.95488722
533	31.89493433
534	31.83520599
535	31.96261682
536	31.90298507
537	32.02979516
538	32.15613383
539	32.09647495
540	32.03703704
541	32.16266174
542	32.28782288
543	32.22836096
544	32.16911765
545	32.11009174
546	32.05128205
547	31.99268739
548	32.11678832
549	32.05828780
550	32.00000000
551	31.94192377
552	31.88405797
553	31.82640145
554	31.76895307
555	31.71171171
556	31.65467626
557	31.59784560
558	31.72043011
559	31.66368515
560	31.60714286
561	31.55080214
562	31.49466192
563	31.61634103
564	31.56028369
565	31.50442478
566	31.44876325
567	31.39329806
568	31.51408451
569	31.45869947
570	31.40350877
571	31.52364273
572	31.46853147
573	31.41361257
574	31.53310105
575	31.47826087
576	31.59722222
577	31.71577123
578	31.83391003
579	31.77892919
580	31.72413793
581	31.66953528
582	31.61512027
583	31.56089194
584	31.67808219
585	31.62393162
586	31.56996587
587	31.51618399
588	31.46258503
589	31.40916808
590	31.35593220
591	31.30287648
592	31.25000000
593	31.19730185
594	31.31313131
595	31.26050420
596	31.37583893
597	31.32328308
598	31.27090301
599	31.38564274
600	31.33333333
601	31.44758735
602	31.39534884
603	31.50912106
604	31.45695364
605	31.57024793
606	31.51815182
607	31.63097199
608	31.57894737
609	31.52709360
610	31.47540984
611	31.42389525
612	31.37254902
613	31.32137031
614	31.43322476
615	31.38211382
616	31.49350649
617	31.60453809
618	31.55339806
619	31.50242326
620	31.45161290
621	31.40096618
622	31.51125402
623	31.46067416
624	31.41025641
625	31.36000000
626	31.46964856
627	31.41945774
628	31.36942675
629	31.47853736
630	31.42857143
631	31.37876387
632	31.48734177
633	31.43759874
634	31.38801262
635	31.33858268
636	31.28930818
637	31.24018838
638	31.19122257
639	31.29890454
640	31.40625000
641	31.35725429
642	31.46417445
643	31.41524106
644	31.52173913
645	31.47286822
646	31.42414861
647	31.37557960
648	31.32716049
649	31.27889060
650	31.23076923
651	31.18279570
652	31.28834356
653	31.24042879
654	31.34556575
655	31.29770992
656	31.40243902
657	31.50684932
658	31.61094225
659	31.56297420
660	31.51515152
661	31.46747352
662	31.41993958
663	31.37254902
664	31.32530120
665	31.27819549
666	31.23123123
667	31.33433283
668	31.43712575
669	31.39013453
670	31.34328358
671	31.29657228
672	31.25000000
673	31.20356612
674	31.15727003
675	31.11111111
676	31.06508876
677	31.01920236
678	30.97345133
679	30.92783505
680	31.02941176
681	30.98384728
682	30.93841642
683	31.03953148
684	30.99415205
685	30.94890511
686	30.90379009
687	30.85880640
688	30.81395349
689	30.76923077
690	30.72463768
691	30.82489146
692	30.78034682
693	30.73593074
694	30.69164265
695	30.64748201
696	30.74712644
697	30.70301291
698	30.65902579
699	30.75822604
700	30.71428571
701	30.81312411
702	30.91168091
703	31.00995733
704	30.96590909
705	30.92198582
706	30.87818697
707	30.83451202
708	30.79096045
709	30.74753173
710	30.84507042
711	30.94233474
712	30.89887640
713	30.99579243
714	31.09243697
715	31.04895105
716	31.00558659
717	30.96234310
718	30.91922006
719	30.87621697
720	30.83333333
721	30.79056865
722	30.74792244
723	30.84370678
724	30.80110497
725	30.89655172
726	30.99173554
727	30.94910591
728	30.90659341
729	30.86419753
730	30.82191781
731	30.77975376
732	30.87431694
733	30.96862210
734	30.92643052
735	30.88435374
736	30.84239130
737	30.93622795
738	30.89430894
739	30.98782138
740	31.08108108
741	31.03913630
742	31.13207547
743	31.22476447
744	31.31720430
745	31.27516779
746	31.23324397
747	31.19143240
748	31.14973262
749	31.10814419
750	31.20000000

Final result: 31.2000 ±1.6929
Random chance: 25.0000 ±1.5822
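
Note on the two ± values above: they are consistent with a binomial standard error that uses n - 1 in the denominator, sqrt(p * (1 - p) / (n - 1)), with n = 750 tasks. The sketch below only reproduces that arithmetic. The formula is inferred from the printed numbers rather than taken from the tool's source, the helper name `binomial_stderr_pct` is hypothetical, and the count of 234 correct answers is derived from 31.2% of 750, not read from the log.

```python
import math


def binomial_stderr_pct(p: float, n: int) -> float:
    """Standard error of a proportion p over n trials, in percent.

    Assumption: an (n - 1) denominator, chosen because it matches the
    values printed in the log; the tool's actual code was not consulted.
    """
    return 100.0 * math.sqrt(p * (1.0 - p) / (n - 1))


n_tasks = 750                 # "calculating TruthfulQA score over 750 tasks"
n_correct = 234               # derived: 31.2% of 750
acc = n_correct / n_tasks     # final acc_norm as a fraction
chance = 0.25                 # "Random chance: 25.0000"

print(f"Final result:  {100.0 * acc:.4f} +/- {binomial_stderr_pct(acc, n_tasks):.4f}")
print(f"Random chance: {100.0 * chance:.4f} +/- {binomial_stderr_pct(chance, n_tasks):.4f}")
```

The acc_norm column is a running score: the value on row k is the percentage of correct answers among the first k tasks, so only the last row (750) is the final result.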