ryanmarten committed
Commit 08b542b · verified · 1 Parent(s): 1a467c4

Update README.md

Files changed (1)
  1. README.md +17 -9
README.md CHANGED
@@ -26,15 +26,23 @@ The dataset is derived by distilling DeepSeek-R1 using the [data pipeline availa
  More info about the dataset can be found on the dataset card at [OpenThoughts-114k dataset](https://huggingface.co/datasets/open-thoughts/open-thoughts-114k).
 
  This model improves upon the [Bespoke-Stratos-7B model](https://huggingface.co/bespokelabs/Bespoke-Stratos-7B), which used 17k examples ([Bespoke-Stratos-17k dataset](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k)).
-
-
- | | AIME24 | MATH500 | GPQA-D | LCB Easy | LCB Med | LCB Hard | Open Weights | Open Data | Open Code |
- |--|--------|---------|--------|----------|---------|----------|--------------|-----------| --------- |
- |OpenThinker-7B|26.7|86|40.4|74.2|31.1|3.3|✔|[✔](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)|[✔](https://github.com/open-thoughts/open-thoughts) |
- |Bespoke-Stratos-7B|20.0|82.0|37.8|71.4|25.2|1.6|✔|[✔](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k)|[✔](https://github.com/bespokelabsai/curator/tree/main/examples/bespoke-stratos-data-generation)|
- |DeepSeek-R1-Distill-Qwen-7B|43.3|89.4|44.9|81.3|42.2|2.4|✔|❌|❌|
- |gpt-4o-0513|9.3|74.6|49.9|-|-|-|❌|❌|❌|
- |o1-mini|63.6|90.0|60.0|-|-|-|❌|❌|❌|
+ The numbers reported in the table below are independently evaluated with our open-source tool [Evalchemy](https://github.com/mlfoundations/Evalchemy).
+
+ | | AIME2024 | MATH500 | GPQA-Diamond | LCB Easy v2 | LCB Medium v2 | LCB Hard v2 | LCB All v2 |
+ | --------------------------- | -------- | ------- | ------------ | ----------- | ------------- | ----------- | ---------- |
+ | OpenThinker-7B | 43.3 | 83.0 | 42.4 | 75.3 | 28.6 | 6.5 | 39.9 |
+ | Bespoke-Stratos-7B | 16.6 | 79.6 | 38.9 | 71.4 | 25.2 | 0.8 | 35.8 |
+ | DeepSeek-R1-Distill-Qwen-7B | 60 | 88.2 | 46.9 | 79.7 | 45.1 | 14.6 | 50.1 |
+ | gpt-4o-0513 | 10 | 75.8 | 46.5 | 87.4 | 42.7 | 8.9 | 50.5 |
+ | o1-mini | 63 | | 60 | 92.8 | 74.7 | 39.8 | 72.8 |
+
+ | | Open Weights | Open Data | Open Code |
+ |--|--------------|-----------| --------- |
+ |OpenThinker-7B|✔|[✔](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)|[✔](https://github.com/open-thoughts/open-thoughts) |
+ |Bespoke-Stratos-7B|✔|[✔](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k)|[✔](https://github.com/bespokelabsai/curator/tree/main/examples/bespoke-stratos-data-generation)|
+ |DeepSeek-R1-Distill-Qwen-7B|✔|❌|❌|
+ |gpt-4o-0513|-|❌|❌|❌|
+ |o1-mini|-|❌|❌|❌|
 
 
  ## Intended uses & limitations
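
As a rough illustration of the claim that OpenThinker-7B improves upon Bespoke-Stratos-7B, the Python sketch below computes the per-benchmark difference using only the scores from the updated table in this commit. The data layout and the helper name `score_deltas` are assumptions made for this example; they are not part of Evalchemy or of the README.

```python
# Scores copied from the updated (Evalchemy-evaluated) table in this commit.
# The structure below is a hypothetical layout chosen for illustration.
BENCHMARKS = [
    "AIME2024", "MATH500", "GPQA-Diamond",
    "LCB Easy v2", "LCB Medium v2", "LCB Hard v2", "LCB All v2",
]

SCORES = {
    "OpenThinker-7B":     [43.3, 83.0, 42.4, 75.3, 28.6, 6.5, 39.9],
    "Bespoke-Stratos-7B": [16.6, 79.6, 38.9, 71.4, 25.2, 0.8, 35.8],
}


def score_deltas(model: str, baseline: str) -> dict[str, float]:
    """Return model-minus-baseline score per benchmark, rounded to one decimal."""
    return {
        name: round(m - b, 1)
        for name, m, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
    }


deltas = score_deltas("OpenThinker-7B", "Bespoke-Stratos-7B")
for name, delta in deltas.items():
    # The "+" format flag prints an explicit sign so gains and losses are easy to spot.
    print(f"{name}: {delta:+.1f}")
```

Only the two models named in that sentence are included; rows with missing cells (such as the empty MATH500 entry for o1-mini) would need explicit handling before being added to a comparison like this.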