---
license: apache-2.0
---
<style>
/* @import must come before other style rules, otherwise browsers ignore it */
@import url('https://fonts.googleapis.com/css2?family=Vollkorn:ital,wght@0,400..900;1,400..900&display=swap');
img{
user-select: none;
transition: all 0.2s ease;
border-radius: .5rem;
}
img:hover{
transform: rotate(2deg);
filter: invert(100%);
}
</style>
<div style="background-color: transparent; border-radius: .5rem; padding: 2rem; font-family: monospace; font-size: .85rem; text-align: justify;">
![cubby](https://huggingface.co./appvoid/cubby/resolve/main/cubby.webp)
This is a passthrough of arco with an experimental model. It improves on arco's ARC challenge score and falls only 1.2 points short of modern 3b baseline performance.
If you need to answer multilingual, general-knowledge, trivially simple questions, choose qwen or llama. If you prefer solving trivially simple English tasks with a model half the size, choose arco.
#### prompt
there is intentionally no prompt set.
#### benchmarks
zero-shot results from state-of-the-art small language models
| Parameters | Model     | MMLU      | ARC-C     | HellaSwag | PIQA      | Winogrande | Average   |
|------------|-----------|-----------|-----------|-----------|-----------|------------|-----------|
| 0.5b       | qwen 2    | 44.13     | 28.92     | 49.05     | 69.31     | 56.99      | 49.68     |
| 0.3b       | smollm    | 25.52     | 37.71     | 56.41     | 71.93     | 59.27      | 50.17     |
| 0.5b       | danube 3  | 24.81     | 36.18     | 60.46     | 73.78     | 61.01      | 51.25     |
| 0.5b       | qwen 2.5  | **47.29** | 31.83     | 52.17     | 70.29     | 57.06      | 51.72     |
| 0.5b       | arco      | 26.17     | 37.29     | 62.88     | 74.37     | **62.27**  | 52.60     |
| 0.5b       | arco 2    | 25.51     | **38.82** | **63.02** | **74.70** | 61.25      | **52.66** |
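
The Average column appears to be the unweighted mean of the five benchmark scores in each row. That is an assumption, since the card does not state how the averages were aggregated. A minimal sketch that recomputes it from the numbers in the table (the names `scores` and `average` are illustrative only):

```python
# Zero-shot scores copied from the table above, in column order:
# (MMLU, ARC-C, HellaSwag, PIQA, Winogrande)
scores = {
    "qwen 2":   (44.13, 28.92, 49.05, 69.31, 56.99),
    "smollm":   (25.52, 37.71, 56.41, 71.93, 59.27),
    "danube 3": (24.81, 36.18, 60.46, 73.78, 61.01),
    "qwen 2.5": (47.29, 31.83, 52.17, 70.29, 57.06),
    "arco":     (26.17, 37.29, 62.88, 74.37, 62.27),
    "arco 2":   (25.51, 38.82, 63.02, 74.70, 61.25),
}


def average(values):
    """Unweighted mean of the benchmark scores (assumed aggregation)."""
    return sum(values) / len(values)


averages = {model: average(vals) for model, vals in scores.items()}

# Print models from lowest to highest recomputed average.
for model, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{model:<9} {avg:.2f}")
```

Recomputed values can differ from the table in the last digit, depending on whether the original figures were rounded or truncated.
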
#### supporters
<a href="https://ko-fi.com/appvoid" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 34px !important; margin-top: -4px;width: 128px !important; filter: contrast(2) grayscale(100%) brightness(100%);" ></a>
#### trivia
arco also means "arc optimized", hence the focus on this cognitive-based benchmark.
</div>