prithivMLmods committed
Commit 937dac8 · verified · 1 Parent(s): 7a4989e

Update README.md

Files changed (1):
  1. README.md +4 -0
README.md CHANGED
@@ -11,3 +11,7 @@ tags:
 - GRPO
 ---
 ![czxbzdxfcv.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/uXSGnHW3iFqYQ9vGX4ggz.png)
+
+# **SmolLM2-135M-Grpo-Checkpoint**
+
+SmolLM2-135M-Grpo-Checkpoint is fine-tuned from SmolLM2-135M-Instruct. SmolLM2 demonstrates significant advances over its predecessor, SmolLM1, particularly in instruction following, knowledge, and reasoning. The 135M model was trained on 2 trillion tokens drawn from a diverse combination of datasets: FineWeb-Edu, DCLM, and The Stack, along with new filtered datasets that we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) on a combination of public datasets and our own curated datasets.
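The README text added by this commit describes the checkpoint's lineage but not how to run it. Below is a minimal inference sketch using 🤗 Transformers; the Hub repo id `prithivMLmods/SmolLM2-135M-Grpo-Checkpoint`, the prompt, and the generation settings are assumptions for illustration and are not part of the commit.

```python
# Minimal inference sketch -- the repo id, prompt, and generation settings
# are assumptions for illustration, not taken from this commit.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prithivMLmods/SmolLM2-135M-Grpo-Checkpoint"  # assumed Hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# SmolLM2-135M-Instruct ships a chat template, so a checkpoint fine-tuned
# from it is prompted the same way: build a message list, apply the template.
messages = [{"role": "user", "content": "Explain gradient descent in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```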