This is NanoLM-0.3B-Instruct-v1, the first version of NanoLM-0.3B-Instruct.
## Model Details
The tokenizer and model architecture of NanoLM-0.3B-Instruct-v1 are the same as [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B), but the number of layers has been reduced from 24 to 12. As a result, NanoLM-0.3B-Instruct-v1 has only 0.3 billion parameters, with approximately **180 million non-embedding parameters**. Despite this, NanoLM-0.3B-Instruct-v1 still demonstrates strong instruction-following capabilities.
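As a rough sanity check, the quoted counts can be reproduced with back-of-the-envelope arithmetic, assuming the published Qwen2-0.5B configuration (hidden size 896, 14 query / 2 key-value heads, intermediate size 4864, vocabulary 151,936, tied embeddings) with the layer count halved:

```python
# Back-of-the-envelope parameter count for NanoLM-0.3B-Instruct-v1,
# assuming the Qwen2-0.5B configuration with the layer count halved.

hidden = 896                 # hidden_size
heads = 14                   # num_attention_heads
kv_heads = 2                 # num_key_value_heads (GQA)
head_dim = hidden // heads   # 64
intermediate = 4864          # intermediate_size (SwiGLU MLP)
vocab = 151_936              # vocab_size (tied input/output embeddings)
layers = 12                  # reduced from Qwen2-0.5B's 24

kv_dim = kv_heads * head_dim  # 128

# Attention: Q/K/V projections carry biases in Qwen2, O does not.
attn = hidden * hidden + hidden          # Q
attn += 2 * (hidden * kv_dim + kv_dim)   # K and V
attn += hidden * hidden                  # O
# SwiGLU MLP: gate, up, and down projections (no biases).
mlp = 3 * hidden * intermediate
# Two RMSNorm weight vectors per layer.
norms = 2 * hidden

per_layer = attn + mlp + norms
non_embedding = layers * per_layer + hidden  # + final RMSNorm
embedding = vocab * hidden                   # tied, so counted once
total = non_embedding + embedding

print(f"non-embedding: {non_embedding / 1e6:.0f}M")  # ~179M
print(f"total:         {total / 1e9:.2f}B")          # ~0.32B
```

With 12 layers this lands at roughly 179M non-embedding parameters and about 0.32B in total, consistent with the figures above.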
Here are some examples. For reproducibility purposes, I've set `do_sample` to `False`. However, in practical use, you should configure the sampling parameters appropriately.
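A minimal sketch of how such examples might be reproduced with 🤗 Transformers is shown below. The repository id is an assumption (replace it with the actual checkpoint path), and the chat-template usage follows the standard Qwen2-style convention:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id -- substitute the actual checkpoint path.
model_id = "NanoLM-0.3B-Instruct-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Explain gravity in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# do_sample=False gives deterministic (greedy) decoding for reproducibility;
# in practice, enable sampling and tune temperature/top_p instead.
outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```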