lewtun committed (verified)
Commit 0fb1437 · 1 Parent(s): df5b499

Update README.md

Files changed (1):
  1. README.md +3 -28
README.md CHANGED
@@ -23,7 +23,7 @@ should probably proofread and complete it, then remove this comment. -->
  NuminaMath is a series of language models that are trained with two stages of supervised fine-tuning to solve math problems using chain of thought (CoT) and tool-integrated reasoning (TIR):
 
  * **Stage 1:** fine-tune the base model on a large, diverse dataset of natural language math problems and solutions, where each solution is templated with Chain of Thought (CoT) to facilitate reasoning.
- * **Stage 2:** fine-tune the model from Stage 1 on a synthetic dataset of tool-integrated reasoning, where each math problem is decomposed into a sequence of rationales, Python programs, and their outputs. Here we followed [Microsoft’s ToRA paper](https://arxiv.org/abs/2309.17452) and prompted GPT-4 to produce solutions in the ToRA format with code execution feedback. Fine-tuning on this data produces a reasoning agent that can solve mathematical problems via a mix of natural language reasoning and use of the Python REPL to compute intermediate results.
+ * **Stage 2:** fine-tune the model from Stage 1 on a synthetic dataset of tool-integrated reasoning, where each math problem is decomposed into a sequence of rationales, Python programs, and their outputs.
 
  NuminaMath 7B CoT is the model from Stage 1 and was fine-tuned on [AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT), a large-scale dataset of 860k+ math competition problem-solution pairs.
 
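For context on what Stage 2's tool-integrated reasoning involves, a minimal sketch of the generate → execute → feed-back loop is shown below. This is a hypothetical illustration of the pattern the removed paragraph describes, not the project's actual inference code; the code-block and output-block delimiters are assumptions borrowed from ToRA-style transcripts, and the unsandboxed `exec` is for demonstration only.

````python
# Hypothetical sketch of a ToRA-style tool-integrated reasoning loop.
# The model alternates natural-language rationales with Python code blocks;
# each block is executed and its output is appended before generation resumes.
import contextlib
import io
import re

CODE_BLOCK = re.compile(r"```python\n(.*?)```", re.DOTALL)

def run_python(code: str) -> str:
    """Execute one generated code block and capture its stdout (unsandboxed; demo only)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})
    except Exception as err:
        return f"Error: {err}"
    return buffer.getvalue().strip()

def solve_with_tir(generate, problem: str, max_rounds: int = 4) -> str:
    """`generate` is any callable that continues a prompt string, e.g. a wrapped LLM."""
    transcript = problem
    for _ in range(max_rounds):
        step = generate(transcript)           # rationale, possibly ending in a code block
        transcript += step
        match = CODE_BLOCK.search(step)
        if match is None:                     # no code emitted: final answer reached
            break
        output = run_python(match.group(1))   # run the program, feed its output back
        transcript += f"\n```output\n{output}\n```\n"
    return transcript
````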
@@ -39,7 +39,6 @@ NuminaMath 7B CoT is the model from Stage 1 and was fine-tuned on [AI-MO/NuminaM
  <!-- Provide the basic links for the model. -->
 
  - **Repository:** https://github.com/project-numina/aimo-progress-prize
- - **Demo:** https://huggingface.co/spaces/AI-MO/math-olympiad-solver
 
  ## Intended uses & limitations
 
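As a concrete example of the intended use, here is a minimal chain-of-thought inference sketch with the 🤗 Transformers pipeline. The checkpoint name comes from this card's citation below; the example problem and generation settings are illustrative assumptions, not the card's recommended configuration.

```python
# Minimal sketch: chain-of-thought inference via the transformers pipeline.
# Assumes a recent transformers release (chat-style inputs) and a CUDA GPU.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="AI-MO/NuminaMath-7B-CoT",  # checkpoint cited later in this card
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "What is the remainder when 2^10 is divided by 7?"},
]
out = pipe(messages, max_new_tokens=512, do_sample=False)
print(out[0]["generated_text"][-1]["content"])  # the step-by-step CoT solution
```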
@@ -108,30 +107,14 @@ If you find NuminaMath 7B TIR is useful in your work, please cite it with:
  ```
  @misc{numina_math_7b,
      author = {Edward Beeching and Shengyi Costa Huang and Albert Jiang and Jia Li and Benjamin Lipkin and Zihan Qina and Kashif Rasul and Ziju Shen and Roman Soletskyi and Lewis Tunstall},
-     title = {NuminaMath 7B TIR},
+     title = {NuminaMath 7B CoT},
      year = {2024},
      publisher = {Numina & Hugging Face},
      journal = {Hugging Face repository},
-     howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-TIR}}
+     howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-CoT}}
  }
  ```
 
- This model is a fine-tuned version of [deepseek-ai/deepseek-math-7b-base](https://huggingface.co/deepseek-ai/deepseek-math-7b-base) on the AI-MO/numina-dataset-v1.0-release-candidate-1-preproc dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.4859
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
  ## Training procedure
 
  ### Training hyperparameters
@@ -150,14 +133,6 @@ The following hyperparameters were used during training:
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 3.0
 
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:-----:|:---------------:|
- | 0.4814 | 1.0 | 6920 | 0.4942 |
- | 0.4188 | 2.0 | 13840 | 0.4728 |
- | 0.3496 | 3.0 | 20760 | 0.4859 |
-
 
  ### Framework versions
 
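For anyone reproducing the run, the two hyperparameters that survive this hunk map onto transformers' `TrainingArguments` roughly as sketched below; every other value is a placeholder, since the rest of the hyperparameter list is truncated from this view.

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
# Only warmup_ratio and num_train_epochs come from the card; the other
# values are illustrative placeholders (the full list is cut off here).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="numinamath-7b-cot",  # placeholder path
    num_train_epochs=3.0,            # num_epochs: 3.0 (from the card)
    warmup_ratio=0.1,                # lr_scheduler_warmup_ratio: 0.1 (from the card)
    learning_rate=2e-5,              # placeholder: not shown in this hunk
    lr_scheduler_type="cosine",      # placeholder: not shown in this hunk
    bf16=True,                       # placeholder: not shown in this hunk
)
```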