File size: 12,600 Bytes
6de6927
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 1548 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	33.33333333
4	50.00000000
5	40.00000000
6	33.33333333
7	42.85714286
8	50.00000000
9	44.44444444
10	50.00000000
11	45.45454545
12	41.66666667
13	38.46153846
14	35.71428571
15	40.00000000
16	43.75000000
17	47.05882353
18	44.44444444
19	47.36842105
20	45.00000000
21	47.61904762
22	45.45454545
23	43.47826087
24	45.83333333
25	48.00000000
26	46.15384615
27	48.14814815
28	46.42857143
29	48.27586207
30	50.00000000
31	51.61290323
32	53.12500000
33	51.51515152
34	52.94117647
35	54.28571429
36	55.55555556
37	54.05405405
38	55.26315789
39	53.84615385
40	55.00000000
41	53.65853659
42	52.38095238
43	53.48837209
44	52.27272727
45	53.33333333
46	52.17391304
47	51.06382979
48	52.08333333
49	51.02040816
50	50.00000000
51	49.01960784
52	50.00000000
53	49.05660377
54	48.14814815
55	49.09090909
56	50.00000000
57	50.87719298
58	51.72413793
59	52.54237288
60	51.66666667
61	52.45901639
62	51.61290323
63	50.79365079
64	50.00000000
65	49.23076923
66	48.48484848
67	47.76119403
68	47.05882353
69	46.37681159
70	47.14285714
71	47.88732394
72	47.22222222
73	46.57534247
74	45.94594595
75	46.66666667
76	47.36842105
77	48.05194805
78	48.71794872
79	48.10126582
80	48.75000000
81	49.38271605
82	48.78048780
83	49.39759036
84	50.00000000
85	49.41176471
86	48.83720930
87	49.42528736
88	50.00000000
89	49.43820225
90	48.88888889
91	48.35164835
92	47.82608696
93	47.31182796
94	46.80851064
95	46.31578947
96	46.87500000
97	46.39175258
98	45.91836735
99	45.45454545
100	45.00000000
101	45.54455446
102	45.09803922
103	44.66019417
104	45.19230769
105	45.71428571
106	46.22641509
107	45.79439252
108	46.29629630
109	46.78899083
110	47.27272727
111	46.84684685
112	46.42857143
113	46.01769912
114	46.49122807
115	46.95652174
116	46.55172414
117	47.00854701
118	47.45762712
119	47.05882353
120	46.66666667
121	47.10743802
122	46.72131148
123	46.34146341
124	45.96774194
125	45.60000000
126	45.23809524
127	44.88188976
128	44.53125000
129	44.18604651
130	43.84615385
131	44.27480916
132	44.69696970
133	44.36090226
134	44.02985075
135	43.70370370
136	43.38235294
137	43.06569343
138	42.75362319
139	43.16546763
140	42.85714286
141	43.26241135
142	43.66197183
143	44.05594406
144	43.75000000
145	44.13793103
146	44.52054795
147	44.21768707
148	44.59459459
149	44.96644295
150	44.66666667
151	44.37086093
152	44.73684211
153	44.44444444
154	44.15584416
155	43.87096774
156	43.58974359
157	43.94904459
158	44.30379747
159	44.02515723
160	44.37500000
161	44.09937888
162	43.82716049
163	43.55828221
164	43.29268293
165	43.63636364
166	43.37349398
167	43.71257485
168	43.45238095
169	43.78698225
170	43.52941176
171	43.85964912
172	43.60465116
173	43.35260116
174	43.10344828
175	42.85714286
176	42.61363636
177	42.37288136
178	42.13483146
179	42.45810056
180	42.77777778
181	43.09392265
182	42.85714286
183	43.16939891
184	42.93478261
185	43.24324324
186	43.54838710
187	43.85026738
188	44.14893617
189	44.44444444
190	44.21052632
191	44.50261780
192	44.27083333
193	44.04145078
194	44.32989691
195	44.61538462
196	44.89795918
197	44.67005076
198	44.94949495
199	44.72361809
200	44.50000000
201	44.27860697
202	44.05940594
203	43.84236453
204	43.62745098
205	43.90243902
206	43.68932039
207	43.96135266
208	43.75000000
209	43.54066986
210	43.33333333
211	43.60189573
212	43.39622642
213	43.19248826
214	42.99065421
215	43.25581395
216	43.51851852
217	43.31797235
218	43.57798165
219	43.83561644
220	44.09090909
221	43.89140271
222	44.14414414
223	44.39461883
224	44.64285714
225	44.88888889
226	44.69026549
227	44.49339207
228	44.73684211
229	44.54148472
230	44.34782609
231	44.15584416
232	44.39655172
233	44.20600858
234	44.44444444
235	44.25531915
236	44.06779661
237	43.88185654
238	43.69747899
239	43.51464435
240	43.75000000
241	43.56846473
242	43.80165289
243	44.03292181
244	43.85245902
245	44.08163265
246	43.90243902
247	43.72469636
248	43.95161290
249	43.77510040
250	43.60000000
251	43.42629482
252	43.25396825
253	43.08300395
254	42.91338583
255	42.74509804
256	42.96875000
257	43.19066148
258	43.02325581
259	42.85714286
260	42.69230769
261	42.91187739
262	42.74809160
263	42.58555133
264	42.80303030
265	43.01886792
266	43.23308271
267	43.07116105
268	42.91044776
269	43.12267658
270	42.96296296
271	42.80442804
272	43.01470588
273	42.85714286
274	42.70072993
275	42.54545455
276	42.75362319
277	42.96028881
278	42.80575540
279	42.65232975
280	42.85714286
281	42.70462633
282	42.55319149
283	42.40282686
284	42.25352113
285	42.10526316
286	41.95804196
287	42.16027875
288	42.01388889
289	41.86851211
290	42.06896552
291	42.26804124
292	42.46575342
293	42.32081911
294	42.17687075
295	42.03389831
296	41.89189189
297	41.75084175
298	41.94630872
299	42.14046823
300	42.33333333
301	42.19269103
302	42.05298013
303	41.91419142
304	41.77631579
305	41.96721311
306	41.83006536
307	42.01954397
308	41.88311688
309	41.74757282
310	41.61290323
311	41.47909968
312	41.66666667
313	41.53354633
314	41.40127389
315	41.26984127
316	41.13924051
317	41.32492114
318	41.19496855
319	41.37931034
320	41.25000000
321	41.12149533
322	40.99378882
323	40.86687307
324	40.74074074
325	40.92307692
326	40.79754601
327	40.97859327
328	40.85365854
329	41.03343465
330	40.90909091
331	40.78549849
332	40.66265060
333	40.84084084
334	40.71856287
335	40.89552239
336	41.07142857
337	40.94955490
338	40.82840237
339	40.70796460
340	40.88235294
341	41.05571848
342	41.22807018
343	41.10787172
344	40.98837209
345	40.86956522
346	41.04046243
347	41.21037464
348	41.09195402
349	40.97421203
350	41.14285714
351	41.02564103
352	41.19318182
353	41.35977337
354	41.24293785
355	41.12676056
356	41.01123596
357	41.17647059
358	41.06145251
359	40.94707521
360	41.11111111
361	41.27423823
362	41.43646409
363	41.59779614
364	41.48351648
365	41.36986301
366	41.53005464
367	41.41689373
368	41.30434783
369	41.19241192
370	41.35135135
371	41.23989218
372	41.12903226
373	41.01876676
374	41.17647059
375	41.33333333
376	41.22340426
377	41.11405836
378	41.26984127
379	41.16094987
380	41.05263158
381	40.94488189
382	40.83769634
383	40.99216710
384	40.88541667
385	41.03896104
386	40.93264249
387	40.82687339
388	40.72164948
389	40.87403599
390	41.02564103
391	41.17647059
392	41.32653061
393	41.22137405
394	41.11675127
395	41.26582278
396	41.41414141
397	41.30982368
398	41.20603015
399	41.35338346
400	41.25000000
401	41.39650873
402	41.29353234
403	41.19106700
404	41.08910891
405	41.23456790
406	41.13300493
407	41.03194103
408	40.93137255
409	41.07579462
410	40.97560976
411	40.87591241
412	40.77669903
413	40.67796610
414	40.82125604
415	40.72289157
416	40.62500000
417	40.52757794
418	40.43062201
419	40.57279236
420	40.47619048
421	40.61757720
422	40.52132701
423	40.42553191
424	40.56603774
425	40.47058824
426	40.37558685
427	40.28103044
428	40.18691589
429	40.32634033
430	40.23255814
431	40.37122970
432	40.27777778
433	40.18475751
434	40.09216590
435	40.00000000
436	39.90825688
437	39.81693364
438	39.95433790
439	40.09111617
440	40.00000000
441	40.13605442
442	40.04524887
443	39.95485327
444	39.86486486
445	39.77528090
446	39.68609865
447	39.59731544
448	39.50892857
449	39.42093541
450	39.33333333
451	39.24611973
452	39.38053097
453	39.51434879
454	39.42731278
455	39.34065934
456	39.47368421
457	39.60612691
458	39.51965066
459	39.65141612
460	39.78260870
461	39.69631236
462	39.82683983
463	39.74082073
464	39.87068966
465	39.78494624
466	39.69957082
467	39.82869379
468	39.95726496
469	40.08528785
470	40.21276596
471	40.12738854
472	40.04237288
473	40.16913319
474	40.08438819
475	40.00000000
476	40.12605042
477	40.04192872
478	40.16736402
479	40.08350731
480	40.20833333
481	40.12474012
482	40.04149378
483	39.95859213
484	39.87603306
485	39.79381443
486	39.91769547
487	40.04106776
488	39.95901639
489	39.87730061
490	40.00000000
491	39.91853360
492	39.83739837
493	39.75659229
494	39.87854251
495	40.00000000
496	40.12096774
497	40.04024145
498	39.95983936
499	40.08016032
500	40.20000000
501	40.11976048
502	40.03984064
503	40.15904573
504	40.07936508
505	40.19801980
506	40.31620553
507	40.23668639
508	40.15748031
509	40.07858546
510	40.19607843
511	40.11741683
512	40.03906250
513	39.96101365
514	39.88326848
515	40.00000000
516	40.11627907
517	40.23210832
518	40.34749035
519	40.26974952
520	40.19230769
521	40.11516315
522	40.22988506
523	40.15296367
524	40.07633588
525	40.00000000
526	39.92395437
527	40.03795066
528	40.15151515
529	40.07561437
530	40.18867925
531	40.11299435
532	40.03759398
533	39.96247655
534	39.88764045
535	40.00000000
536	39.92537313
537	40.03724395
538	40.14869888
539	40.07421150
540	40.00000000
541	39.92606285
542	40.03690037
543	39.96316759
544	40.07352941
545	40.00000000
546	39.92673993
547	40.03656307
548	40.14598540
549	40.07285974
550	40.00000000
551	40.10889292
552	40.03623188
553	39.96383363
554	39.89169675
555	39.81981982
556	39.92805755
557	39.85637343
558	39.96415771
559	39.89266547
560	39.82142857
561	39.75044563
562	39.85765125
563	39.96447602
564	39.89361702
565	40.00000000
566	39.92932862
567	39.85890653
568	39.96478873
569	39.89455185
570	39.82456140
571	39.75481611
572	39.68531469
573	39.61605585
574	39.54703833
575	39.47826087
576	39.58333333
577	39.51473137
578	39.61937716
579	39.55094991
580	39.65517241
581	39.58691910
582	39.51890034
583	39.62264151
584	39.72602740
585	39.82905983
586	39.76109215
587	39.69335605
588	39.62585034
589	39.55857385
590	39.49152542
591	39.42470389
592	39.35810811
593	39.46037099
594	39.56228956
595	39.49579832
596	39.59731544
597	39.53098827
598	39.46488294
599	39.56594324
600	39.50000000
601	39.43427621
602	39.36877076
603	39.46932007
604	39.40397351
605	39.50413223
606	39.43894389
607	39.53871499
608	39.47368421
609	39.40886700
610	39.34426230
611	39.44353519
612	39.37908497
613	39.47797716
614	39.57654723
615	39.51219512
616	39.61038961
617	39.54619125
618	39.48220065
619	39.41841680
620	39.35483871
621	39.29146538
622	39.38906752
623	39.32584270
624	39.26282051
625	39.20000000
626	39.29712460
627	39.23444976
628	39.17197452
629	39.26868045
630	39.20634921
631	39.14421553
632	39.24050633
633	39.17851501
634	39.11671924
635	39.05511811
636	38.99371069
637	39.08948195
638	39.02821317
639	38.96713615
640	39.06250000
641	39.00156006
642	38.94080997
643	38.88024883
644	38.81987578
645	38.75968992
646	38.69969040
647	38.63987635
648	38.58024691
649	38.52080123
650	38.46153846
651	38.40245776
652	38.49693252
653	38.43797856
654	38.53211009
655	38.47328244
656	38.41463415
657	38.35616438
658	38.44984802
659	38.39150228
660	38.33333333
661	38.27534039
662	38.21752266
663	38.15987934
664	38.25301205
665	38.34586466
666	38.28828829
667	38.38080960
668	38.47305389
669	38.56502242
670	38.65671642
671	38.74813711
672	38.69047619
673	38.63298663
674	38.57566766
675	38.51851852
676	38.46153846
677	38.40472674
678	38.49557522
679	38.43888071
680	38.52941176
681	38.61967695
682	38.56304985
683	38.50658858
684	38.59649123
685	38.68613139
686	38.62973761
687	38.57350801
688	38.51744186
689	38.46153846
690	38.40579710
691	38.49493488
692	38.43930636
693	38.38383838
694	38.32853026
695	38.27338129
696	38.36206897
697	38.30703013
698	38.39541547
699	38.48354793
700	38.42857143
701	38.37375178
702	38.46153846
703	38.54907539
704	38.63636364
705	38.58156028
706	38.52691218
707	38.47241867
708	38.41807910
709	38.50493653
710	38.45070423
711	38.53727145
712	38.48314607
713	38.56942496
714	38.65546218
715	38.60139860
716	38.54748603
717	38.49372385
718	38.44011142
719	38.38664812
720	38.33333333
721	38.28016644
722	38.22714681
723	38.31258645
724	38.39779006
725	38.48275862
726	38.42975207
727	38.51444292
728	38.46153846
729	38.40877915
730	38.35616438
731	38.30369357
732	38.25136612
733	38.33560709
734	38.28337875
735	38.23129252
736	38.17934783
737	38.26322931
738	38.34688347
739	38.43031123
740	38.37837838
741	38.32658570
742	38.40970350
743	38.49259758
744	38.57526882
745	38.52348993
746	38.47184987
747	38.55421687
748	38.50267380
749	38.45126836
750	38.53333333

Final result: 38.5333 +/- 1.7783
Random chance: 25.0000 +/- 1.5822