It should be noted that this model is the raw model after merging.
It still has randomly initialized router networks and will not be better than a single one of its expert models.
This model requires further training before use.

This model has a total of 47.5B params, which is slightly more than the [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) with its 46.7B params.

## Licensing

This model is licensed under the Llama 3.1 Community License, Copyright (c) 2024 [Philip May](https://philipmay.org), [Deutsche Telekom AG](https://www.telekom.de/)\