File size: 9,695 Bytes
ca1fd2a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
multiple_choice_score: there are 1548 tasks in prompt
multiple_choice_score: reading tasks.......................................................................................................done
multiple_choice_score: preparing task data...done
multiple_choice_score : calculating TruthfulQA score over 1548 tasks.

task	acc_norm
1	0.00000000
2	0.00000000
3	33.33333333
4	25.00000000
5	40.00000000
6	50.00000000
7	42.85714286
8	50.00000000
9	44.44444444
10	40.00000000
11	45.45454545
12	41.66666667
13	38.46153846
14	35.71428571
15	33.33333333
16	37.50000000
17	35.29411765
18	33.33333333
19	36.84210526
20	35.00000000
21	33.33333333
22	31.81818182
23	30.43478261
24	29.16666667
25	28.00000000
26	26.92307692
27	29.62962963
28	32.14285714
29	34.48275862
30	36.66666667
31	38.70967742
32	37.50000000
33	36.36363636
34	35.29411765
35	37.14285714
36	38.88888889
37	40.54054054
38	39.47368421
39	38.46153846
40	40.00000000
41	39.02439024
42	40.47619048
43	39.53488372
44	38.63636364
45	37.77777778
46	36.95652174
47	38.29787234
48	37.50000000
49	36.73469388
50	38.00000000
51	39.21568627
52	40.38461538
53	41.50943396
54	42.59259259
55	41.81818182
56	41.07142857
57	40.35087719
58	39.65517241
59	40.67796610
60	40.00000000
61	39.34426230
62	38.70967742
63	39.68253968
64	40.62500000
65	40.00000000
66	39.39393939
67	38.80597015
68	38.23529412
69	39.13043478
70	38.57142857
71	38.02816901
72	37.50000000
73	36.98630137
74	37.83783784
75	38.66666667
76	38.15789474
77	37.66233766
78	37.17948718
79	36.70886076
80	37.50000000
81	37.03703704
82	37.80487805
83	38.55421687
84	38.09523810
85	38.82352941
86	39.53488372
87	39.08045977
88	38.63636364
89	38.20224719
90	37.77777778
91	37.36263736
92	36.95652174
93	37.63440860
94	38.29787234
95	37.89473684
96	37.50000000
97	37.11340206
98	36.73469388
99	37.37373737
100	38.00000000
101	38.61386139
102	38.23529412
103	37.86407767
104	37.50000000
105	38.09523810
106	37.73584906
107	38.31775701
108	37.96296296
109	37.61467890
110	38.18181818
111	38.73873874
112	39.28571429
113	39.82300885
114	39.47368421
115	40.00000000
116	40.51724138
117	41.02564103
118	41.52542373
119	41.17647059
120	40.83333333
121	41.32231405
122	40.98360656
123	41.46341463
124	41.12903226
125	41.60000000
126	41.26984127
127	41.73228346
128	41.40625000
129	41.08527132
130	40.76923077
131	40.45801527
132	40.90909091
133	41.35338346
134	41.79104478
135	41.48148148
136	41.91176471
137	41.60583942
138	41.30434783
139	41.00719424
140	40.71428571
141	40.42553191
142	40.14084507
143	39.86013986
144	39.58333333
145	39.31034483
146	39.04109589
147	38.77551020
148	38.51351351
149	38.25503356
150	38.00000000
151	37.74834437
152	38.15789474
153	37.90849673
154	38.31168831
155	38.70967742
156	39.10256410
157	39.49044586
158	39.87341772
159	40.25157233
160	40.00000000
161	40.37267081
162	40.12345679
163	39.87730061
164	40.24390244
165	40.60606061
166	40.36144578
167	40.11976048
168	40.47619048
169	40.23668639
170	40.58823529
171	40.93567251
172	41.27906977
173	41.04046243
174	41.37931034
175	41.14285714
176	41.47727273
177	41.24293785
178	41.57303371
179	41.89944134
180	42.22222222
181	42.54143646
182	42.85714286
183	42.62295082
184	42.93478261
185	42.70270270
186	42.47311828
187	42.78074866
188	42.55319149
189	42.32804233
190	42.10526316
191	42.40837696
192	42.70833333
193	43.00518135
194	43.29896907
195	43.07692308
196	43.36734694
197	43.65482234
198	43.43434343
199	43.21608040
200	43.50000000
201	43.28358209
202	43.56435644
203	43.34975369
204	43.62745098
205	43.41463415
206	43.20388350
207	42.99516908
208	43.26923077
209	43.54066986
210	43.33333333
211	43.60189573
212	43.39622642
213	43.19248826
214	42.99065421
215	42.79069767
216	42.59259259
217	42.39631336
218	42.20183486
219	42.00913242
220	41.81818182
221	41.62895928
222	41.89189189
223	41.70403587
224	41.51785714
225	41.77777778
226	42.03539823
227	42.29074890
228	42.10526316
229	42.35807860
230	42.17391304
231	41.99134199
232	41.81034483
233	41.63090129
234	41.45299145
235	41.70212766
236	41.94915254
237	42.19409283
238	42.01680672
239	41.84100418
240	41.66666667
241	41.90871369
242	41.73553719
243	41.97530864
244	41.80327869
245	42.04081633
246	41.86991870
247	42.10526316
248	41.93548387
249	41.76706827
250	41.60000000
251	41.43426295
252	41.26984127
253	41.50197628
254	41.33858268
255	41.17647059
256	41.01562500
257	40.85603113
258	40.69767442
259	40.92664093
260	40.76923077
261	40.99616858
262	40.83969466
263	41.06463878
264	41.28787879
265	41.13207547
266	41.35338346
267	41.57303371
268	41.79104478
269	41.63568773
270	41.85185185
271	42.06642066
272	42.27941176
273	42.12454212
274	41.97080292
275	42.18181818
276	42.02898551
277	41.87725632
278	42.08633094
279	41.93548387
280	42.14285714
281	41.99288256
282	42.19858156
283	42.40282686
284	42.60563380
285	42.45614035
286	42.30769231
287	42.50871080
288	42.36111111
289	42.21453287
290	42.41379310
291	42.26804124
292	42.12328767
293	41.97952218
294	41.83673469
295	41.69491525
296	41.55405405
297	41.41414141
298	41.27516779
299	41.13712375
300	41.00000000
301	40.86378738
302	40.72847682
303	40.59405941
304	40.46052632
305	40.32786885
306	40.52287582
307	40.39087948
308	40.25974026
309	40.12944984
310	40.00000000
311	40.19292605
312	40.38461538
313	40.57507987
314	40.76433121
315	40.95238095
316	40.82278481
317	40.69400631
318	40.88050314
319	41.06583072
320	40.93750000
321	40.80996885
322	40.68322981
323	40.86687307
324	40.74074074
325	40.92307692
326	40.79754601
327	40.97859327
328	40.85365854
329	40.72948328
330	40.60606061
331	40.48338369
332	40.66265060
333	40.54054054
334	40.41916168
335	40.59701493
336	40.47619048
337	40.35608309
338	40.23668639
339	40.11799410
340	40.00000000
341	39.88269795
342	40.05847953
343	40.23323615
344	40.40697674
345	40.28985507
346	40.46242775
347	40.63400576
348	40.51724138
349	40.68767908
350	40.57142857
351	40.45584046
352	40.62500000
353	40.50991501
354	40.39548023
355	40.28169014
356	40.16853933
357	40.33613445
358	40.22346369
359	40.11142061
360	40.00000000
361	39.88919668
362	40.05524862
363	39.94490358
364	40.10989011
365	40.00000000
366	40.16393443
367	40.05449591
368	40.21739130
369	40.37940379
370	40.27027027
371	40.16172507
372	40.05376344
373	40.21447721
374	40.37433155
375	40.26666667
376	40.15957447
377	40.05305040
378	39.94708995
379	39.84168865
380	39.73684211
381	39.89501312
382	40.05235602
383	39.94778068
384	40.10416667
385	40.00000000
386	39.89637306
387	40.05167959
388	39.94845361
389	40.10282776
390	40.25641026
391	40.15345269
392	40.05102041
393	40.20356234
394	40.10152284
395	40.00000000
396	39.89898990
397	40.05037783
398	39.94974874
399	40.10025063
400	40.25000000
401	40.14962594
402	40.04975124
403	40.19851117
404	40.09900990
405	40.00000000
406	39.90147783
407	40.04914005
408	39.95098039
409	40.09779951
410	40.00000000
411	39.90267640
412	39.80582524
413	39.95157385
414	39.85507246
415	39.75903614
416	39.90384615
417	39.80815348
418	39.71291866
419	39.61813842
420	39.52380952
421	39.42992874
422	39.33649289
423	39.24349882
424	39.38679245
425	39.29411765
426	39.20187793
427	39.34426230
428	39.25233645
429	39.39393939
430	39.30232558
431	39.44315545
432	39.35185185
433	39.26096998
434	39.17050691
435	39.08045977
436	38.99082569
437	39.13043478
438	39.04109589
439	38.95216401
440	39.09090909
441	39.22902494
442	39.14027149
443	39.27765237
444	39.18918919
445	39.32584270
446	39.46188341
447	39.37360179
448	39.28571429
449	39.19821826
450	39.11111111
451	39.02439024
452	38.93805310
453	38.85209713
454	38.98678414
455	39.12087912
456	39.25438596
457	39.16849015
458	39.08296943
459	39.21568627
460	39.34782609
461	39.47939262
462	39.39393939
463	39.30885529
464	39.22413793
465	39.13978495
466	39.05579399
467	39.18629550
468	39.31623932
469	39.23240938
470	39.36170213
471	39.49044586
472	39.61864407
473	39.53488372
474	39.45147679
475	39.36842105
476	39.49579832
477	39.41299790
478	39.53974895
479	39.45720251
480	39.58333333
481	39.50103950
482	39.62655602
483	39.75155280
484	39.66942149
485	39.58762887
486	39.71193416
487	39.63039014
488	39.54918033
489	39.46830266
490	39.38775510
491	39.30753564
492	39.43089431
493	39.35091278
494	39.27125506
495	39.19191919
496	39.11290323
497	39.03420523
498	38.95582329
499	39.07815631
500	39.00000000
501	38.92215569
502	38.84462151
503	38.76739563
504	38.69047619
505	38.61386139
506	38.53754941
507	38.65877712
508	38.77952756
509	38.70333988
510	38.62745098
511	38.55185910
512	38.47656250
513	38.59649123
514	38.52140078
515	38.64077670
516	38.56589147
517	38.49129594
518	38.61003861
519	38.72832370
520	38.84615385
521	38.77159309
522	38.69731801
523	38.81453155
524	38.74045802
525	38.85714286
526	38.78326996
527	38.89943074
528	38.82575758
529	38.94139887
530	39.05660377
531	38.98305085
532	38.90977444
533	39.02439024
534	38.95131086
535	39.06542056
536	39.17910448
537	39.29236499
538	39.21933086
539	39.33209647
540	39.25925926
541	39.37153420
542	39.29889299
543	39.22651934
544	39.33823529
545	39.26605505
546	39.19413919
547	39.12248629
548	39.23357664
549	39.34426230
550	39.45454545
551	39.56442831
552	39.67391304
553	39.60216998
554	39.53068592
555	39.63963964
556	39.74820144
557	39.85637343
558	39.78494624
559	39.71377460
560	39.64285714
561	39.75044563
562	39.67971530
563	39.60923623
564	39.53900709
565	39.64601770
566	39.75265018
567	39.68253968
568	39.78873239
569	39.89455185
570	40.00000000
571	40.10507881
572	40.03496503
573	39.96509599
574	39.89547038
575	39.82608696
576	39.75694444
577	39.68804159
578	39.61937716
579	39.55094991
580	39.65517241
581	39.58691910
582	39.69072165
583	39.62264151
584	39.72602740
585	39.65811966
586	39.59044369
587	39.69335605
588	39.62585034
589	39.55857385
590	39.49152542
591	39.42470389
592	39.35810811