INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning

INTELLECT-MATH is a 7B-parameter model optimized for mathematical reasoning. It was trained in two stages: an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which it was trained using the PRIME-RL recipe.
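As a rough illustration of what fine-tuning on "verified" teacher outputs could involve, the sketch below keeps only those teacher responses whose final answer matches a reference answer. This is an assumption about the verification step, not the pipeline actually used for INTELLECT-MATH: the `\boxed{...}` answer convention, the sample field names (`response`, `reference`), and the helpers `extract_boxed` and `filter_verified` are all hypothetical.

```python
def extract_boxed(text):
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
        i += 1
    return None  # braces never closed


def normalize(answer):
    """Strip all whitespace so '1 / 2' and '1/2' compare equal (hypothetical rule)."""
    return "".join(answer.split())


def filter_verified(samples):
    """Keep samples whose boxed final answer equals the reference answer."""
    kept = []
    for sample in samples:
        predicted = extract_boxed(sample["response"])
        if predicted is not None and normalize(predicted) == normalize(sample["reference"]):
            kept.append(sample)
    return kept


# Toy teacher outputs (made up for illustration, not real QwQ generations).
samples = [
    {"question": "1 + 1?", "response": "Adding gives \\boxed{2}", "reference": "2"},
    {"question": "2 + 2?", "response": "Adding gives \\boxed{5}", "reference": "4"},
    {"question": "Half?", "response": "So \\boxed{\\frac{1}{2}}", "reference": "\\frac{1}{2}"},
    {"question": "Unknown", "response": "I am not sure.", "reference": "1"},
]
verified = filter_verified(samples)
```

Exact string matching is the simplest possible check; a real verifier would likely need symbolic or numeric equivalence to avoid discarding correct answers written in a different form.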

We demonstrate that the quality of the SFT data affects both the final performance and the training speed of the RL stage. Because its synthetic SFT dataset encourages the model to imitate the reasoning behavior of a strong teacher model, INTELLECT-MATH outperforms Eurus-2-PRIME, the previous state-of-the-art model trained with PRIME-RL, and matches its performance with 10x faster training.

| | Intellect-Math (Step 255) | Intellect-Math (Step 47) | Eurus-2-Prime (Step 592) | Intellect-Math-SFT | Eurus-2-SFT | Qwen-2.5-Math |
|---|---|---|---|---|---|---|
| MATH-500 | 82.0 | 81.6 | 79.2 | 72.8 | 65.1 | 79.8 |
| OLYMPIADBENCH | 49.5 | 46.7 | 42.1 | 39.1 | 29.8 | 40.7 |
| AIME 2024 | 26.7 | 26.7 | 26.7 | 16.6 | 3.3 | 13.3 |
| AMC | 60.2 | 57.8 | 57.8 | 45.8 | 30.1 | 50.6 |
| MINERVA MATH | 39.7 | 37.8 | 38.6 | 33.8 | 32.7 | 34.6 |
| AVG | 51.6 | 50.1 | 48.9 | 41.6 | 32.2 | 43.8 |
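The AVG row and the speed comparison can be checked with a few lines of arithmetic. The sketch below recomputes each column's unweighted mean over the five benchmarks and compares the step counts of the Step 47 and Step 592 checkpoints. It assumes AVG is a plain mean and that the two runs have a comparable cost per RL step; neither is stated in the table, and all variable names are illustrative.

```python
# Per-model scores in row order: MATH-500, OlympiadBench, AIME 2024, AMC, Minerva Math.
scores = {
    "Intellect-Math (Step 255)": [82.0, 49.5, 26.7, 60.2, 39.7],
    "Intellect-Math (Step 47)": [81.6, 46.7, 26.7, 57.8, 37.8],
    "Eurus-2-Prime (Step 592)": [79.2, 42.1, 26.7, 57.8, 38.6],
    "Intellect-Math-SFT": [72.8, 39.1, 16.6, 45.8, 33.8],
    "Eurus-2-SFT": [65.1, 29.8, 3.3, 30.1, 32.7],
    "Qwen-2.5-Math": [79.8, 40.7, 13.3, 50.6, 34.6],
}

# AVG row as reported in the table.
reported_avg = {
    "Intellect-Math (Step 255)": 51.6,
    "Intellect-Math (Step 47)": 50.1,
    "Eurus-2-Prime (Step 592)": 48.9,
    "Intellect-Math-SFT": 41.6,
    "Eurus-2-SFT": 32.2,
    "Qwen-2.5-Math": 43.8,
}

# Unweighted mean per model (assumption: AVG is a plain mean of the five rows).
computed_avg = {name: sum(vals) / len(vals) for name, vals in scores.items()}

# A reported value is consistent if it agrees to one decimal place.
consistent = {
    name: abs(computed_avg[name] - reported_avg[name]) <= 0.05 + 1e-9
    for name in scores
}

# Ratio of RL steps between the Eurus-2-Prime checkpoint and the early
# Intellect-Math checkpoint that already has a higher average.
step_ratio = 592 / 47

for name in scores:
    print(f"{name}: computed {computed_avg[name]:.2f}, reported {reported_avg[name]}")
print(f"step ratio: {step_ratio:.1f}")  # prints "step ratio: 12.6"
```

Under these assumptions every reported average matches the recomputed mean, and the Step 47 checkpoint (50.1) already exceeds Eurus-2-Prime at Step 592 (48.9) with roughly 12.6 times fewer steps, which is consistent with the stated 10x speedup.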

Model size: 7.62B params (Safetensors); tensor type: F32