File size: 12,594 Bytes
94d6d71
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
common_init_from_params: setting dry_penalty_last_n to ctx_size = 768
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |
multiple_choice_score: there are 869 tasks in prompt
multiple_choice_score: selecting 750 random tasks from 869 tasks available
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 750 tasks.

task	acc_norm
1	100.00000000
2	50.00000000
3	66.66666667
4	50.00000000
5	60.00000000
6	66.66666667
7	57.14285714
8	62.50000000
9	55.55555556
10	50.00000000
11	45.45454545
12	41.66666667
13	38.46153846
14	42.85714286
15	46.66666667
16	43.75000000
17	47.05882353
18	44.44444444
19	42.10526316
20	40.00000000
21	38.09523810
22	40.90909091
23	39.13043478
24	41.66666667
25	40.00000000
26	38.46153846
27	40.74074074
28	39.28571429
29	37.93103448
30	40.00000000
31	38.70967742
32	37.50000000
33	39.39393939
34	41.17647059
35	40.00000000
36	38.88888889
37	40.54054054
38	42.10526316
39	41.02564103
40	42.50000000
41	43.90243902
42	42.85714286
43	44.18604651
44	43.18181818
45	42.22222222
46	43.47826087
47	44.68085106
48	45.83333333
49	44.89795918
50	46.00000000
51	45.09803922
52	46.15384615
53	47.16981132
54	46.29629630
55	45.45454545
56	44.64285714
57	45.61403509
58	46.55172414
59	47.45762712
60	48.33333333
61	49.18032787
62	48.38709677
63	49.20634921
64	48.43750000
65	47.69230769
66	48.48484848
67	47.76119403
68	48.52941176
69	49.27536232
70	48.57142857
71	47.88732394
72	47.22222222
73	47.94520548
74	48.64864865
75	49.33333333
76	50.00000000
77	50.64935065
78	51.28205128
79	50.63291139
80	51.25000000
81	50.61728395
82	51.21951220
83	50.60240964
84	51.19047619
85	51.76470588
86	52.32558140
87	52.87356322
88	53.40909091
89	53.93258427
90	54.44444444
91	54.94505495
92	55.43478261
93	54.83870968
94	54.25531915
95	54.73684211
96	54.16666667
97	53.60824742
98	54.08163265
99	54.54545455
100	55.00000000
101	54.45544554
102	54.90196078
103	54.36893204
104	53.84615385
105	54.28571429
106	53.77358491
107	54.20560748
108	53.70370370
109	53.21100917
110	53.63636364
111	54.05405405
112	54.46428571
113	54.86725664
114	54.38596491
115	54.78260870
116	54.31034483
117	54.70085470
118	55.08474576
119	54.62184874
120	55.00000000
121	54.54545455
122	54.91803279
123	55.28455285
124	54.83870968
125	55.20000000
126	55.55555556
127	55.90551181
128	55.46875000
129	55.03875969
130	54.61538462
131	54.96183206
132	55.30303030
133	55.63909774
134	55.22388060
135	54.81481481
136	54.41176471
137	54.74452555
138	55.07246377
139	55.39568345
140	55.71428571
141	56.02836879
142	56.33802817
143	56.64335664
144	56.25000000
145	56.55172414
146	56.16438356
147	56.46258503
148	56.08108108
149	56.37583893
150	56.66666667
151	56.95364238
152	57.23684211
153	57.51633987
154	57.14285714
155	56.77419355
156	57.05128205
157	57.32484076
158	57.59493671
159	57.86163522
160	57.50000000
161	57.76397516
162	58.02469136
163	57.66871166
164	57.31707317
165	57.57575758
166	57.83132530
167	57.48502994
168	57.73809524
169	57.39644970
170	57.64705882
171	57.30994152
172	56.97674419
173	57.22543353
174	57.47126437
175	57.14285714
176	56.81818182
177	57.06214689
178	56.74157303
179	56.98324022
180	56.66666667
181	56.35359116
182	56.59340659
183	56.28415301
184	55.97826087
185	56.21621622
186	56.45161290
187	56.14973262
188	55.85106383
189	55.55555556
190	55.26315789
191	55.49738220
192	55.20833333
193	54.92227979
194	55.15463918
195	55.38461538
196	55.61224490
197	55.83756345
198	56.06060606
199	56.28140704
200	56.50000000
201	56.71641791
202	56.93069307
203	56.65024631
204	56.86274510
205	56.58536585
206	56.31067961
207	56.52173913
208	56.73076923
209	56.93779904
210	57.14285714
211	57.34597156
212	57.07547170
213	56.80751174
214	57.00934579
215	57.20930233
216	56.94444444
217	56.68202765
218	56.88073394
219	57.07762557
220	56.81818182
221	56.56108597
222	56.30630631
223	56.05381166
224	56.25000000
225	56.00000000
226	55.75221239
227	55.94713656
228	56.14035088
229	56.33187773
230	56.52173913
231	56.27705628
232	56.03448276
233	56.22317597
234	55.98290598
235	55.74468085
236	55.50847458
237	55.27426160
238	55.46218487
239	55.64853556
240	55.83333333
241	56.01659751
242	55.78512397
243	55.55555556
244	55.32786885
245	55.10204082
246	54.87804878
247	55.06072874
248	55.24193548
249	55.02008032
250	55.20000000
251	55.37848606
252	55.55555556
253	55.33596838
254	55.11811024
255	55.29411765
256	55.46875000
257	55.64202335
258	55.81395349
259	55.59845560
260	55.76923077
261	55.93869732
262	56.10687023
263	56.27376426
264	56.43939394
265	56.22641509
266	56.01503759
267	56.17977528
268	55.97014925
269	56.13382900
270	56.29629630
271	56.08856089
272	55.88235294
273	56.04395604
274	55.83941606
275	55.63636364
276	55.79710145
277	55.95667870
278	55.75539568
279	55.55555556
280	55.35714286
281	55.51601423
282	55.67375887
283	55.47703180
284	55.63380282
285	55.43859649
286	55.59440559
287	55.74912892
288	55.90277778
289	55.70934256
290	55.86206897
291	56.01374570
292	55.82191781
293	55.97269625
294	55.78231293
295	55.93220339
296	56.08108108
297	56.22895623
298	56.04026846
299	56.18729097
300	56.00000000
301	55.81395349
302	55.96026490
303	55.77557756
304	55.92105263
305	56.06557377
306	55.88235294
307	56.02605863
308	55.84415584
309	55.98705502
310	56.12903226
311	56.27009646
312	56.08974359
313	56.23003195
314	56.05095541
315	55.87301587
316	56.01265823
317	56.15141956
318	55.97484277
319	55.79937304
320	55.62500000
321	55.45171340
322	55.59006211
323	55.41795666
324	55.55555556
325	55.38461538
326	55.52147239
327	55.35168196
328	55.18292683
329	55.31914894
330	55.45454545
331	55.58912387
332	55.42168675
333	55.55555556
334	55.38922156
335	55.22388060
336	55.05952381
337	54.89614243
338	54.73372781
339	54.86725664
340	55.00000000
341	55.13196481
342	54.97076023
343	55.10204082
344	55.23255814
345	55.07246377
346	55.20231214
347	55.04322767
348	54.88505747
349	55.01432665
350	54.85714286
351	54.98575499
352	54.82954545
353	54.67422096
354	54.51977401
355	54.36619718
356	54.21348315
357	54.06162465
358	54.18994413
359	54.03899721
360	54.16666667
361	54.29362881
362	54.41988950
363	54.54545455
364	54.39560440
365	54.24657534
366	54.09836066
367	53.95095368
368	54.07608696
369	54.20054201
370	54.05405405
371	53.90835580
372	53.76344086
373	53.61930295
374	53.74331551
375	53.86666667
376	53.72340426
377	53.84615385
378	53.96825397
379	53.82585752
380	53.94736842
381	53.80577428
382	53.66492147
383	53.52480418
384	53.38541667
385	53.50649351
386	53.62694301
387	53.48837209
388	53.35051546
389	53.21336761
390	53.33333333
391	53.19693095
392	53.06122449
393	53.18066158
394	53.29949239
395	53.41772152
396	53.28282828
397	53.40050378
398	53.26633166
399	53.13283208
400	53.25000000
401	53.11720698
402	52.98507463
403	53.10173697
404	53.21782178
405	53.33333333
406	53.44827586
407	53.56265356
408	53.67647059
409	53.54523227
410	53.65853659
411	53.77128954
412	53.88349515
413	53.75302663
414	53.86473430
415	53.73493976
416	53.60576923
417	53.47721823
418	53.34928230
419	53.22195704
420	53.33333333
421	53.20665083
422	53.31753555
423	53.42789598
424	53.30188679
425	53.17647059
426	53.05164319
427	52.92740047
428	52.80373832
429	52.91375291
430	53.02325581
431	53.13225058
432	53.00925926
433	52.88683603
434	52.76497696
435	52.87356322
436	52.98165138
437	53.08924485
438	52.96803653
439	53.07517084
440	53.18181818
441	53.28798186
442	53.16742081
443	53.04740406
444	52.92792793
445	53.03370787
446	53.13901345
447	53.24384787
448	53.34821429
449	53.45211581
450	53.33333333
451	53.43680710
452	53.53982301
453	53.42163355
454	53.30396476
455	53.18681319
456	53.07017544
457	52.95404814
458	52.83842795
459	52.94117647
460	53.04347826
461	52.92841649
462	53.03030303
463	52.91576674
464	52.80172414
465	52.68817204
466	52.78969957
467	52.67665953
468	52.77777778
469	52.66524520
470	52.76595745
471	52.86624204
472	52.96610169
473	52.85412262
474	52.74261603
475	52.63157895
476	52.52100840
477	52.62054507
478	52.71966527
479	52.81837161
480	52.70833333
481	52.59875260
482	52.69709544
483	52.79503106
484	52.89256198
485	52.98969072
486	53.08641975
487	53.18275154
488	53.27868852
489	53.16973415
490	53.06122449
491	53.15682281
492	53.04878049
493	52.94117647
494	53.03643725
495	52.92929293
496	52.82258065
497	52.91750503
498	53.01204819
499	53.10621242
500	53.20000000
501	53.09381238
502	53.18725100
503	53.28031809
504	53.17460317
505	53.26732673
506	53.16205534
507	53.05719921
508	53.14960630
509	53.04518664
510	53.13725490
511	53.03326810
512	53.12500000
513	53.02144250
514	53.11284047
515	53.20388350
516	53.10077519
517	52.99806576
518	52.89575290
519	52.79383430
520	52.69230769
521	52.78310940
522	52.87356322
523	52.77246654
524	52.67175573
525	52.57142857
526	52.47148289
527	52.37191651
528	52.27272727
529	52.17391304
530	52.26415094
531	52.35404896
532	52.44360902
533	52.53283302
534	52.43445693
535	52.33644860
536	52.23880597
537	52.32774674
538	52.23048327
539	52.31910946
540	52.22222222
541	52.12569316
542	52.21402214
543	52.30202578
544	52.20588235
545	52.29357798
546	52.19780220
547	52.10237660
548	52.18978102
549	52.27686703
550	52.18181818
551	52.08711434
552	51.99275362
553	52.07956600
554	52.16606498
555	52.25225225
556	52.15827338
557	52.24416517
558	52.15053763
559	52.23613596
560	52.32142857
561	52.40641711
562	52.49110320
563	52.57548845
564	52.48226950
565	52.38938053
566	52.47349823
567	52.55731922
568	52.64084507
569	52.54833040
570	52.45614035
571	52.36427320
572	52.27272727
573	52.35602094
574	52.43902439
575	52.52173913
576	52.60416667
577	52.68630849
578	52.76816609
579	52.84974093
580	52.93103448
581	52.83993115
582	52.74914089
583	52.83018868
584	52.91095890
585	52.99145299
586	53.07167235
587	52.98126065
588	52.89115646
589	52.97113752
590	53.05084746
591	53.13028765
592	53.20945946
593	53.11973019
594	53.19865320
595	53.27731092
596	53.18791946
597	53.09882747
598	53.01003344
599	53.08848080
600	53.00000000
601	52.91181364
602	52.82392027
603	52.73631841
604	52.64900662
605	52.56198347
606	52.64026403
607	52.55354201
608	52.63157895
609	52.54515599
610	52.62295082
611	52.53682488
612	52.45098039
613	52.36541599
614	52.28013029
615	52.35772358
616	52.43506494
617	52.51215559
618	52.42718447
619	52.34248788
620	52.41935484
621	52.33494364
622	52.25080386
623	52.32744783
624	52.24358974
625	52.16000000
626	52.07667732
627	52.15311005
628	52.22929936
629	52.30524642
630	52.38095238
631	52.45641838
632	52.37341772
633	52.29067930
634	52.20820189
635	52.12598425
636	52.20125786
637	52.11930926
638	52.19435737
639	52.11267606
640	52.03125000
641	52.10608424
642	52.02492212
643	51.94401244
644	52.01863354
645	51.93798450
646	51.85758514
647	51.77743431
648	51.69753086
649	51.61787365
650	51.53846154
651	51.61290323
652	51.53374233
653	51.45482389
654	51.52905199
655	51.45038168
656	51.37195122
657	51.29375951
658	51.36778116
659	51.28983308
660	51.36363636
661	51.28593041
662	51.35951662
663	51.43288084
664	51.50602410
665	51.57894737
666	51.65165165
667	51.72413793
668	51.64670659
669	51.56950673
670	51.64179104
671	51.56482861
672	51.48809524
673	51.41158990
674	51.33531157
675	51.25925926
676	51.18343195
677	51.25553914
678	51.32743363
679	51.39911635
680	51.47058824
681	51.39500734
682	51.31964809
683	51.24450952
684	51.16959064
685	51.24087591
686	51.16618076
687	51.09170306
688	51.16279070
689	51.08853411
690	51.15942029
691	51.23010130
692	51.30057803
693	51.37085137
694	51.44092219
695	51.51079137
696	51.43678161
697	51.36298422
698	51.28939828
699	51.21602289
700	51.28571429
701	51.21255350
702	51.28205128
703	51.20910384
704	51.27840909
705	51.34751773
706	51.41643059
707	51.48514851
708	51.55367232
709	51.62200282
710	51.69014085
711	51.75808720
712	51.82584270
713	51.89340813
714	51.82072829
715	51.74825175
716	51.81564246
717	51.74337517
718	51.67130919
719	51.59944367
720	51.66666667
721	51.59500693
722	51.52354571
723	51.59059474
724	51.65745856
725	51.72413793
726	51.65289256
727	51.58184319
728	51.64835165
729	51.57750343
730	51.64383562
731	51.70998632
732	51.77595628
733	51.84174625
734	51.90735695
735	51.83673469
736	51.90217391
737	51.83175034
738	51.76151762
739	51.82679296
740	51.75675676
741	51.82186235
742	51.88679245
743	51.95154778
744	51.88172043
745	51.94630872
746	51.87667560
747	51.80722892
748	51.73796791
749	51.66889186
750	51.73333333

Final result: 51.7333 ±1.8259
Random chance: 25.0083 ±1.5824