appvoid/niro-preview-2409

appvoid committed on Sep 19, 2024

Commit 547cf26 · verified · 1 Parent(s): ad92d5a

Update README.md

Files changed (1)

README.md +11 -1

@@ -9,4 +9,14 @@ tags:
 arco ultra is an improvement over [WizardLM-Evol-V2-Unfiltered](https://huggingface.co/trollek/danube2-1.8b-WizardLM-Evol-V2-Unfiltered), which at the time of writing is a state-of-the-art 1.8-billion-parameter mistral language model. The model was first merged with another sota model ([Synthia-v1.3](https://huggingface.co/trollek/danube2-1.8b-Synthia-v1.3)) in order to get a balanced, performant model.
-Post-merging was used just like with palmer models (merge, finetune later).
+Post-merging was used just like with palmer models (merge, finetune later).
+#### benchmarks
+zero-shot evaluations, compared against current sota models in the ~0.5b to 1.8b range.
+| Parameters | Model  | MMLU      | ARC-C     | HellaSwag | PIQA      | Winogrande | Average   |
+|------------|--------|-----------|-----------|-----------|-----------|------------|-----------|
+| 0.5B       | arco   | 26.17     | 37.29     | 62.88     | 74.37     | 62.27      | 52.60     |
+| 1.8B       | wizard | 40.79     | 40.87     | 71.85     | **78.02** | 64.33      | 59.17     |
+| 1.8B       | niro   | **41.75** | **40.96** | **72.07** | 77.97     | **65.51**  | **59.65** |
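
The Average column appears to be the unweighted mean of the five benchmark scores in each row. The sketch below recomputes it under that assumption. The `scores` dictionary and the `average` helper are illustrative names, not part of the repository, and the numbers are copied from the table above.

```python
# Per-model scores in table order: MMLU, ARC-C, HellaSwag, PIQA, Winogrande.
# Values are copied from the README table; nothing here is re-measured.
scores = {
    "arco": [26.17, 37.29, 62.88, 74.37, 62.27],
    "wizard": [40.79, 40.87, 71.85, 78.02, 64.33],
    "niro": [41.75, 40.96, 72.07, 77.97, 65.51],
}


def average(values):
    """Return the unweighted mean of the scores, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


# Assumption: the table's Average column is a plain arithmetic mean.
averages = {name: average(values) for name, values in scores.items()}

for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Under this assumption the recomputed means match the Average column to two decimals, so the ranking in the table follows directly from the per-benchmark scores.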