It should be noted that this model is the raw model after merging.
It still has randomly initialized router networks and will not be better than a single one of its expert models.
This model requires further training before use.

This model has a total of 47.5B params, which is slightly more than the [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) with its 46.7B params.

## Licensing

This model is licensed under the Llama 3.1 Community License, Copyright (c) 2024 [Philip May](https://philipmay.org), [Deutsche Telekom AG](https://www.telekom.de/)\