metadata
license: apache-2.0
datasets:
- gair-prox/open-web-math-pro
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
TinyLlama-1.1B-ProXMath
ArXiv | Data: OpenWebMath-Pro | Code
TinyLlama-1.1B-ProXMath is a math-adapted TinyLlama-1.1B model, continually pre-trained for 15B tokens on OpenWebMath-Pro, a version of OpenWebMath refined by ProX.
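As a minimal sketch of how the model might be loaded for text completion with the Hugging Face `transformers` library: the repository id `gair-prox/TinyLlama-1.1B-ProXMath` is an assumption inferred from the dataset namespace above, and the helper name `complete` and its defaults are hypothetical. Because this is a base (not instruction-tuned) model, the sketch uses plain prompt continuation rather than a chat template.

```python
# Assumed repository id; it is not stated in this card, so adjust as needed.
MODEL_ID = "gair-prox/TinyLlama-1.1B-ProXMath"


def complete(prompt, model_id=MODEL_ID, max_new_tokens=64):
    """Greedily continue `prompt` with the causal language model at `model_id`.

    Hypothetical helper for illustration. Imports are deferred so that
    defining the function does not require transformers to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for reproducible output
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `complete("Question: What is 12 * 7?\nAnswer:")` would download the weights on first use and return the prompt followed by the model's continuation.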
Evaluations
ProX models are evaluated on 9 common math reasoning benchmarks.
Model | asdiv | gsm8k | mathqa | mawps | minerva_math | mmlu_stem | sat_math | svamp | tabmwp | average |
---|---|---|---|---|---|---|---|---|---|---|
TinyLlama-1.1B | 18.0 | 2.8 | 14.6 | 20.2 | 3.2 | 16.3 | 21.9 | 10.9 | 12.5 | 13.4 |
TinyLlama-1.1B-ProXMath | 41.9 | 9.0 | 15.6 | 56.9 | 5.6 | 26.8 | 31.2 | 23.8 | 22.2 | 25.7 |
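To make the per-benchmark differences in the table easier to read, the short sketch below computes the absolute gain of TinyLlama-1.1B-ProXMath over the base model. The scores are copied from the two rows above; the names `BENCHMARKS`, `BASE`, `PROX`, and `gains` are illustrative only.

```python
# Benchmark names and scores copied from the evaluation table above.
BENCHMARKS = [
    "asdiv", "gsm8k", "mathqa", "mawps", "minerva_math",
    "mmlu_stem", "sat_math", "svamp", "tabmwp",
]
BASE = [18.0, 2.8, 14.6, 20.2, 3.2, 16.3, 21.9, 10.9, 12.5]  # TinyLlama-1.1B
PROX = [41.9, 9.0, 15.6, 56.9, 5.6, 26.8, 31.2, 23.8, 22.2]  # TinyLlama-1.1B-ProXMath


def gains(base, adapted, names=BENCHMARKS):
    """Return the absolute score difference (adapted - base) per benchmark."""
    return {n: round(a - b, 1) for n, b, a in zip(names, base, adapted)}


deltas = gains(BASE, PROX)
largest = max(deltas, key=deltas.get)
smallest = min(deltas, key=deltas.get)

for name, delta in deltas.items():
    print(f"{name:>13}: +{delta}")
print(f"largest gain: {largest}, smallest gain: {smallest}")
```

On these numbers the largest gain is on mawps (+36.7 points) and the smallest on mathqa (+1.0 point); every benchmark improves.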
Citation
@article{zhou2024programming,
title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
journal={arXiv preprint arXiv:2409.17115},
year={2024}
}