task,metric,value,err,version
anli_r1,acc,0.326,0.01483050720454104,0
anli_r2,acc,0.333,0.014910846164229863,0
anli_r3,acc,0.3333333333333333,0.013613950010225605,0
arc_challenge,acc,0.26023890784982934,0.012821930225112568,0
arc_challenge,acc_norm,0.30716723549488056,0.013481034054980943,0
arc_easy,acc,0.5888047138047138,0.010096663811817685,0
arc_easy,acc_norm,0.5648148148148148,0.010173216430370911,0
boolq,acc,0.6064220183486239,0.00854467241848691,1
cb,acc,0.39285714285714285,0.0658538889806635,1
cb,f1,0.2880105401844532,,1
copa,acc,0.75,0.04351941398892446,0
hellaswag,acc,0.43308105954989046,0.00494488954549795,0
hellaswag,acc_norm,0.5610436168094005,0.004952454721934803,0
piqa,acc,0.7480957562568009,0.010128421335088681,0
piqa,acc_norm,0.7459194776931447,0.010157271999135046,0
rte,acc,0.5270758122743683,0.030052303463143713,0
sciq,acc,0.872,0.010570133761108654,0
sciq,acc_norm,0.849,0.011328165223341673,0
storycloze_2016,acc,0.6884019241047569,0.010710200919679802,0
winogrande,acc,0.5595895816890292,0.013952330311915605,0