prithivMLmods commited on
Commit
17f5f78
·
verified ·
1 Parent(s): b87f9c2

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -3
README.md CHANGED
@@ -16,9 +16,7 @@ tags:
16
 
17
  SmolLM2-135M-Grpo-Checkpoint is fine-tuned based on SmolLM2-135M-Instruct. SmolLM2 demonstrates significant advances over its predecessor, SmolLM1, particularly in instruction following, knowledge, and reasoning. The 135M model was trained on 2 trillion tokens using a diverse combination of datasets: FineWeb-Edu, DCLM, The Stack, along with new filtered datasets that we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) using a combination of public datasets and our own curated datasets.
18
 
19
- # **SmolLM2-135M-Grpo-Checkpoint**
20
-
21
- SmolLM2-135M-Grpo-Checkpoint is fine-tuned based on SmolLM2-135M-Instruct. SmolLM2 demonstrates significant advances over its predecessor, SmolLM1, particularly in instruction following, knowledge, and reasoning. The 135M model was trained on 2 trillion tokens using a diverse combination of datasets: FineWeb-Edu, DCLM, The Stack, along with new filtered datasets that we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) using a combination of public datasets and our own curated datasets.
22
 
23
  ### **SmolLM2 135M Grpo Fine-tuning**
24
 
 
16
 
17
  SmolLM2-135M-Grpo-Checkpoint is fine-tuned based on SmolLM2-135M-Instruct. SmolLM2 demonstrates significant advances over its predecessor, SmolLM1, particularly in instruction following, knowledge, and reasoning. The 135M model was trained on 2 trillion tokens using a diverse combination of datasets: FineWeb-Edu, DCLM, The Stack, along with new filtered datasets that we curated and will release soon. We developed the instruct version through supervised fine-tuning (SFT) using a combination of public datasets and our own curated datasets.
18
 
19
+ ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/8AQZsg8B_p0U44FHnjE0F.png)
 
 
20
 
21
  ### **SmolLM2 135M Grpo Fine-tuning**
22