Mxode committed on
Commit 1dc5130 · verified · 1 Parent(s): 9172055

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ This is NanoLM-0.3B-Instruct-v1, the first version of NanoLM-0.3B-Instruct. The

  ## Model Details

- The tokenizer and model architecture of NanoLM-0.3B-Instruct-v1 are the same as [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B), but the number of layers has been reduced from 24 to 12. As a result, NanoLM-0.3B-Instruct-v1 has only 0.3 billion parameters, with approximately 180 million non-embedding parameters. Despite this, NanoLM-0.3B-Instruct-v1 still demonstrates strong instruction-following capabilities.
+ The tokenizer and model architecture of NanoLM-0.3B-Instruct-v1 are the same as [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B), but the number of layers has been reduced from 24 to 12. As a result, NanoLM-0.3B-Instruct-v1 has only 0.3 billion parameters, with approximately **180 million non-embedding parameters**. Despite this, NanoLM-0.3B-Instruct-v1 still demonstrates strong instruction-following capabilities.

  Here are some examples. For reproducibility purposes, I've set `do_sample` to `False`. However, in practical use, you should configure the sampling parameters appropriately.

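The parameter figures in the changed line can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. It assumes the dimensions published in the Qwen/Qwen2-0.5B config (hidden size 896, MLP size 4864, 14 attention heads, 2 key/value heads, vocabulary 151936, tied input/output embeddings, biases on the q/k/v projections only) and changes only the layer count to 12, as the README describes. `Qwen2LikeConfig` and `count_params` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class Qwen2LikeConfig:
    # Assumed values, taken from the published Qwen2-0.5B config;
    # only num_hidden_layers differs (12 instead of 24, per the README).
    vocab_size: int = 151936
    hidden_size: int = 896
    intermediate_size: int = 4864
    num_attention_heads: int = 14
    num_key_value_heads: int = 2
    num_hidden_layers: int = 12
    tie_word_embeddings: bool = True


def count_params(cfg: Qwen2LikeConfig) -> dict:
    """Count parameters of a Qwen2-style decoder from its dimensions."""
    h = cfg.hidden_size
    head_dim = h // cfg.num_attention_heads
    kv_dim = head_dim * cfg.num_key_value_heads

    # Attention: q/k/v projections carry biases, the output projection does not.
    attn = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h
    # Gated MLP: gate, up, and down projections, no biases.
    mlp = 3 * h * cfg.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * h

    per_layer = attn + mlp + norms
    # Add the final RMSNorm after the last layer.
    non_embedding = cfg.num_hidden_layers * per_layer + h

    embedding = cfg.vocab_size * h
    if not cfg.tie_word_embeddings:
        embedding *= 2  # separate output projection when weights are not tied

    return {
        "non_embedding": non_embedding,
        "embedding": embedding,
        "total": non_embedding + embedding,
    }


if __name__ == "__main__":
    for name, value in count_params(Qwen2LikeConfig()).items():
        print(f"{name}: {value / 1e6:.1f}M")
```

Under these assumptions the count comes to roughly 179 million non-embedding parameters and about 315 million in total, in line with the "180 million" and "0.3 billion" stated in the README. Setting `num_hidden_layers=24` gives about 494 million, close to the base model's nominal 0.5B.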