niro

niro is an improvement over the excellent WizardLM-Evol-V2-Unfiltered model, which at the time of writing is the best 1.8-billion-parameter mistral model. Keep in mind that niro is an untrained merge; further improvements are yet to come.

benchmarks

zero-shot evaluations performed against current sota small models; mmlu remains the main reason qwen models score higher on average. Everywhere else, niro is on par with the best language models below 2b parameters.

| Parameters | Model | MMLU | ARC | HellaSwag | PIQA | Winogrande | Average |
|------------|-------|------|-----|-----------|------|------------|---------|
| 0.5b | qwen 2.5 | 47.29 | 31.83 | 52.17 | 70.29 | 57.06 | 51.72 |
| 0.5b | arco | 26.17 | 37.29 | 62.88 | 74.37 | 62.27 | 52.60 |
| 0.5b | arco (exp) | 25.51 | 38.82 | 63.02 | 74.70 | 61.25 | 52.66 |
| 1.7b | smollm | 27.65 | 46.26 | 65.74 | 76.06 | 60.93 | 55.33 |
| 1.8b | niro-preview | 41.75 | 40.96 | 72.07 | 77.97 | 65.51 | 59.65 |
| 1.5b | qwen 2.5 | 58.68 | 44.71 | 67.62 | 75.73 | 62.67 | 61.88 |
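The Average column is the unweighted mean of the five benchmark scores. A minimal sketch of how the niro-preview row's average is computed (scores copied from the table above):

```python
# Zero-shot scores for niro-preview, in table order:
# MMLU, ARC, HellaSwag, PIQA, Winogrande
scores = [41.75, 40.96, 72.07, 77.97, 65.51]

# Unweighted mean across the five benchmarks,
# rounded to two decimals to match the table.
average = round(sum(scores) / len(scores), 2)
print(average)  # 59.65
```

The same arithmetic reproduces every row's Average column, so the ranking in the table follows directly from the per-benchmark scores.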