====== Perplexity statistics ======
Mean PPL(Q)                   : 27.258871 ±0.268444
Mean PPL(base)                : 24.931431 ±0.241228
Cor(ln(PPL(Q)), ln(PPL(base))): 98.83%
Mean ln(PPL(Q)/PPL(base))     : 0.089250 ±0.001501
Mean PPL(Q)/PPL(base)         : 1.093354 ±0.001641
Mean PPL(Q)-PPL(base)         : 2.327441 ±0.047451

====== KL divergence statistics ======
Mean    KLD: 0.099355 ±0.000288
Maximum KLD: 3.087584
99.9%   KLD: 0.860643
99.0%   KLD: 0.485928
Median  KLD: 0.067769
10.0%   KLD: 0.001206
 5.0%   KLD: 0.000193
 1.0%   KLD: 0.000008
Minimum KLD: -0.000121

====== Token probability statistics ======
Mean    Δp: -1.067 ± 0.020 %
Maximum Δp: 80.056%
99.9%   Δp: 37.342%
99.0%   Δp: 21.343%
95.0%   Δp: 9.216%
90.0%   Δp: 4.374%
75.0%   Δp: 0.267%
Median  Δp: -0.010%
25.0%   Δp: -1.733%
10.0%   Δp: -8.613%
 5.0%   Δp: -14.630%
 1.0%   Δp: -27.490%
 0.1%   Δp: -44.569%
Minimum Δp: -75.190%
RMS Δp    : 7.613 ± 0.033 %
Same top p: 83.490 ± 0.096 %