rhysjones committed on
Commit 6e644f3 · verified · 1 Parent(s): 5ce02cc

Update with float16 evals

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -15,7 +15,7 @@ library_name: transformers
 
 A 124M parameter GPT2 model trained with the 10B fineweb-edu dataset using [https://github.com/karpathy/llm.c](https://github.com/karpathy/llm.c)
 
-Training took 20 hours on a single 4090 GPU, giving the following graphs:
+Training took 20 hours on a single 4090 GPU (limited to 350W), giving the following graphs:
 
 ![gpt2-124M-edu-fineweb-10B](https://huggingface.co/rhysjones/gpt2-124M-edu-fineweb-10B/resolve/main/graph.png)
 
@@ -47,10 +47,10 @@ The model has had no further finetuning.
 Evals using [Eleuther AI Harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463) gives:
 | Eval Test | Score |
 | --------- | ----- |
-| arc_challenge (25 shot) | 24.83 |
-| gsm8k (5 shot) | 0.00 |
-| hellaswag (10 shot) | 32.52 |
-| mmlu (5 shot) | 25.95 |
+| arc_challenge (25 shot) | 24.49 |
+| gsm8k (5 shot) | 0.08 |
+| hellaswag (10 shot) | 32.64 |
+| mmlu (5 shot) | 26.06 |
 | truthfulqa (0 shot) | 42.45 |
-| winogrande (5 shot) | 53.35 |
-| **Overall Score** | **29.85** |
+| winogrande (5 shot) | 52.17 |
+| **Overall Score** | **29.65** |
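
The README does not say how the Overall Score row is derived. The figures are consistent with an unweighted mean of the six task scores, and the sketch below checks that arithmetic for the updated float16 values. The averaging rule is an assumption inferred from the numbers, and the `overall_score` helper and `scores` dictionary are hypothetical names used only for illustration.

```python
# Per-task scores copied from the "+" rows of the table (float16 evals).
scores = {
    "arc_challenge": 24.49,
    "gsm8k": 0.08,
    "hellaswag": 32.64,
    "mmlu": 26.06,
    "truthfulqa": 42.45,
    "winogrande": 52.17,
}


def overall_score(task_scores):
    """Unweighted mean of the task scores, rounded to two decimals.

    Assumption: the README's Overall Score is a plain average; the
    source does not state this explicitly.
    """
    return round(sum(task_scores.values()) / len(task_scores), 2)


print(overall_score(scores))
```

Applying the same helper to the six removed scores gives the earlier Overall Score of 29.85, so the 0.20 point drop follows directly from the per-task changes.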