Update README.md
README.md (CHANGED)
@@ -23,7 +23,7 @@ should probably proofread and complete it, then remove this comment. -->
 NuminaMath is a series of language models that are trained with two stages of supervised fine-tuning to solve math problems using chain of thought (CoT) and tool-integrated reasoning (TIR):
 
 * **Stage 1:** fine-tune the base model on a large, diverse dataset of natural language math problems and solutions, where each solution is templated with Chain of Thought (CoT) to facilitate reasoning.
-* **Stage 2:** fine-tune the model from Stage 1 on a synthetic dataset of tool-integrated reasoning, where each math problem is decomposed into a sequence of rationales, Python programs, and their outputs.
+* **Stage 2:** fine-tune the model from Stage 1 on a synthetic dataset of tool-integrated reasoning, where each math problem is decomposed into a sequence of rationales, Python programs, and their outputs.
 
 NuminaMath 7B CoT is the model from Stage 1 and was fine-tuned on [AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT), a large-scale dataset of 860k+ math competition problem-solution pairs.
 
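As a quick illustration of the Stage 1 (CoT) model described in this hunk, a minimal usage sketch follows. It is not part of the README being diffed: the model id `AI-MO/NuminaMath-7B-CoT` is taken from the citation URL further down in the file, and the prompt format is assumed to be the tokenizer's built-in chat template, which the `text-generation` pipeline applies automatically.

```python
# Minimal usage sketch, not from the card: assumes the model id from the
# citation URL and a chat template shipped with the tokenizer.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="AI-MO/NuminaMath-7B-CoT",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "For how many integer values of x is x^2 - 5x + 6 negative?"},
]

# Greedy decoding; the Stage 1 model is expected to reply with a step-by-step CoT solution.
out = pipe(messages, max_new_tokens=1024, do_sample=False, return_full_text=False)
print(out[0]["generated_text"])
```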
@@ -39,7 +39,6 @@ NuminaMath 7B CoT is the model from Stage 1 and was fine-tuned on [AI-MO/NuminaM
 <!-- Provide the basic links for the model. -->
 
 - **Repository:** https://github.com/project-numina/aimo-progress-prize
-- **Demo:** https://huggingface.co/spaces/AI-MO/math-olympiad-solver
 
 ## Intended uses & limitations
 
@@ -108,30 +107,14 @@ If you find NuminaMath 7B TIR is useful in your work, please cite it with:
 ```
 @misc{numina_math_7b,
   author = {Edward Beeching and Shengyi Costa Huang and Albert Jiang and Jia Li and Benjamin Lipkin and Zihan Qina and Kashif Rasul and Ziju Shen and Roman Soletskyi and Lewis Tunstall},
-  title = {NuminaMath 7B
+  title = {NuminaMath 7B CoT},
   year = {2024},
   publisher = {Numina & Hugging Face},
   journal = {Hugging Face repository},
-  howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-
+  howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-CoT}}
 }
 ```
 
-This model is a fine-tuned version of [deepseek-ai/deepseek-math-7b-base](https://huggingface.co/deepseek-ai/deepseek-math-7b-base) on the AI-MO/numina-dataset-v1.0-release-candidate-1-preproc dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.4859
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
 ## Training procedure
 
 ### Training hyperparameters
@@ -150,14 +133,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 3.0
 
-### Training results
-
-| Training Loss | Epoch | Step  | Validation Loss |
-|:-------------:|:-----:|:-----:|:---------------:|
-| 0.4814        | 1.0   | 6920  | 0.4942          |
-| 0.4188        | 2.0   | 13840 | 0.4728          |
-| 0.3496        | 3.0   | 20760 | 0.4859          |
-
 
 ### Framework versions
 
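Only two training hyperparameters remain visible in the last hunk (`lr_scheduler_warmup_ratio: 0.1` and `num_epochs: 3.0`). As a rough sketch, and not something stated in the card, they map onto `transformers.TrainingArguments` as follows; every other value is a placeholder.

```python
# Sketch only: maps the two hyperparameters visible in the hunk above onto
# transformers.TrainingArguments. The output directory and all omitted settings
# (learning rate, batch size, scheduler type, ...) are placeholders or library
# defaults, not values from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="numinamath-7b-cot-sft",  # placeholder
    num_train_epochs=3.0,                # card: num_epochs: 3.0
    warmup_ratio=0.1,                    # card: lr_scheduler_warmup_ratio: 0.1
)
```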