PrimeIntellect/INTELLECT-MATH-SFT

justus27 committed 4 days ago

Commit 5966544 (verified) · 1 Parent(s): 24531ec

Update README.md

Files changed (1)

README.md +29 -3

@@ -1,3 +1,29 @@
----
-license: mit
----

+---
+license: mit
+---
+# INTELLECT-MATH: State-of-the-Art Mathematical Reasoning through Better Initializations for Reinforcement Learning
+INTELLECT-MATH is a 7B-parameter model optimized for mathematical reasoning. It was trained in two stages: an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the [PRIME-RL](https://github.com/PRIME-RL/PRIME) recipe.
+We demonstrate that the quality of the SFT data can affect both the final performance and the training speed of the RL stage. Because its better synthetic SFT dataset encourages the model to imitate the reasoning behavior of a strong teacher model, INTELLECT-MATH outperforms Eurus-2-PRIME, the previous state-of-the-art model trained with PRIME-RL, and matches its performance with 10x faster training.
+| Benchmark | INTELLECT-MATH (Step 255) | INTELLECT-MATH (Step 47) | Eurus-2-PRIME (Step 592) | INTELLECT-MATH-SFT | Eurus-2-SFT | Qwen-2.5-Math |
+|----------------|---------------------------:|--------------------------:|--------------------------:|--------------------:|------------:|-------------:|
+| **MATH-500**   | 82.0                      | 81.6                     | 79.2                     | 72.8               | 65.1        | 79.8         |
+| **OLYMPIADBENCH** | 49.5                   | 46.7                     | 42.1                     | 39.1               | 29.8        | 40.7         |
+| **AIME 2024**  | 26.7                      | 26.7                     | 26.7                     | 16.6               | 3.3         | 13.3         |
+| **AMC**        | 60.2                      | 57.8                     | 57.8                     | 45.8               | 30.1        | 50.6         |
+| **MINERVA MATH** | 39.7                    | 37.8                     | 38.6                     | 33.8               | 32.7        | 34.6         |
+| **AVG**        | 51.6                      | 50.1                     | 48.9                     | 41.6               | 32.2        | 43.8         |
+### Links
+- 📜 [Blog Post]()
+- 🔗 [Github]()
+- 🤗 [Hugging Face Collection]()
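
Reviewer note: the AVG row of the table added in this diff can be sanity-checked with a short script. The sketch below assumes AVG is the unweighted mean of the five benchmark scores, rounded to one decimal; the README does not state this, but the numbers are consistent with it. The variable names are illustrative only. The step ratio at the end is a rough proxy that assumes training steps are comparable between the two runs, which the README does not confirm.

```python
# Scores copied from the table in the diff, in row order:
# MATH-500, OLYMPIADBENCH, AIME 2024, AMC, MINERVA MATH.
scores = {
    "INTELLECT-MATH (Step 255)": [82.0, 49.5, 26.7, 60.2, 39.7],
    "INTELLECT-MATH (Step 47)": [81.6, 46.7, 26.7, 57.8, 37.8],
    "Eurus-2-PRIME (Step 592)": [79.2, 42.1, 26.7, 57.8, 38.6],
    "INTELLECT-MATH-SFT": [72.8, 39.1, 16.6, 45.8, 33.8],
    "Eurus-2-SFT": [65.1, 29.8, 3.3, 30.1, 32.7],
    "Qwen-2.5-Math": [79.8, 40.7, 13.3, 50.6, 34.6],
}

def average(values):
    """Unweighted mean rounded to one decimal (assumed AVG definition)."""
    return round(sum(values) / len(values), 1)

averages = {name: average(vals) for name, vals in scores.items()}

# Ratio of RL steps between the two checkpoints that the README compares.
# Assumption: a step costs about the same in both runs.
step_ratio = round(592 / 47, 1)

for name, avg in averages.items():
    print(f"{name}: {avg}")
print(f"Step ratio (592 / 47): {step_ratio}")
```

Under these assumptions the computed means match the AVG row for all six columns, and the step ratio is in line with the roughly 10x speedup the README reports.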