Bschleter committed on
Commit 7d91199 · verified · 1 Parent(s): b6306e1

Update README.md


A simple typo fix, but it is in the intro and easily visible.

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -30,7 +30,7 @@ QwQ is the reasoning model of the Qwen series. Compared with conventional instru
 - Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
 - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
 - Number of Parameters: 32.5B
-- Number of Paramaters (Non-Embedding): 31.0B
+- Number of Parameters (Non-Embedding): 31.0B
 - Number of Layers: 64
 - Number of Attention Heads (GQA): 40 for Q and 8 for KV
 - Context Length: Full 131,072 tokens