---
library_name: transformers
tags:
- mergekit
- merge
---

![niro](https://huggingface.co./appvoid/niro/resolve/main/niro.webp)

niro is an improvement over the excellent [WizardLM-Evol-V2-Unfiltered](https://huggingface.co./trollek/danube2-1.8b-WizardLM-Evol-V2-Unfiltered) model, which at the time of writing is the best 1.8-billion-parameter mistral model. Keep in mind that niro is an untrained merge; further improvements are yet to come.

#### benchmarks

zero-shot evaluations performed on current sota small models; mmlu is still the reason qwen models are better on average. Currently, niro is on par with the best language models below 2b parameters.

| Parameters | Model        | MMLU      | ARC       | HellaSwag | PIQA      | Winogrande | Average   |
|------------|--------------|-----------|-----------|-----------|-----------|------------|-----------|
| 0.5b       | qwen 2.5     | 47.29     | 31.83     | 52.17     | 70.29     | 57.06      | 51.72     |
| 0.5b       | arco         | 26.17     | 37.29     | 62.88     | 74.37     | 62.27      | 52.60     |
| 0.5b       | arco (exp)   | 25.51     | 38.82     | 63.02     | 74.70     | 61.25      | 52.66     |
| 1.7b       | smollm       | 27.65     | **46.26** | 65.74     | 76.06     | 60.93      | 55.33     |
| 1.8b       | niro-preview | 41.75     | 40.96     | **72.07** | **77.97** | **65.51**  | **59.65** |
| 1.5b       | qwen 2.5     | **58.68** | 44.71     | 67.62     | 75.73     | 62.67      | **61.88** |
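
Since niro is a mergekit merge built on WizardLM-Evol-V2-Unfiltered, a config along the following lines could reproduce a similar untrained merge. This is only a sketch: the actual recipe is not published in this card, so the second model, the weights, and the dtype below are illustrative assumptions, not the real settings.

```yaml
# hypothetical mergekit config -- the real recipe for niro is not published.
# A simple linear (weighted average) merge of the WizardLM-Evol-V2-Unfiltered
# finetune with its danube2 base model; weights are illustrative guesses.
models:
  - model: trollek/danube2-1.8b-WizardLM-Evol-V2-Unfiltered
    parameters:
      weight: 0.5
  - model: h2oai/h2o-danube2-1.8b-base  # assumed base; not confirmed by the card
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
```

A config like this would be run with mergekit's CLI (`mergekit-yaml config.yml ./output-dir`) to produce the merged weights without any further training.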