PhilipMay committed commit 349e1a4 (verified) · Parent(s): cd2e0ea

Update README.md

Files changed (1): README.md (+4 −0)
README.md CHANGED
@@ -8,6 +8,10 @@ It is based on [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
 was created with the help of [mergekit](https://github.com/arcee-ai/mergekit).
 This is the mergekit configuration we used: [mergekit_moe_config.yml](https://huggingface.co/deutsche-telekom/Llama-3.1-MoE-8x8B-Instruct-raw/blob/main/mergekit_moe_config.yml)
 
+It should be noted that this model is the raw model after merging.
+It still has randomly initialized router networks and will not be better than a single one of its expert models.
+This model requires further training before use.
+
 ## Licensing
 
 This model is licensed under the Llama 3.1 Community License, Copyright (c) 2024 [Philip May](https://philipmay.org), [Deutsche Telekom AG](https://www.telekom.de/)\
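The actual configuration used for this merge is the linked mergekit_moe_config.yml and is not reproduced in this diff. As a hedged illustration only (not the authors' file), a mergekit-moe configuration that produces randomly initialized router networks — matching the note added above — typically looks like this; the expert model names below are placeholders:

```yaml
# Hypothetical sketch of a mergekit-moe config; the real file is linked
# above as mergekit_moe_config.yml and may differ in every detail.
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
gate_mode: random        # routers are randomly initialized, so the raw
                         # merge needs further training before use
dtype: bfloat16
experts:
  - source_model: meta-llama/Meta-Llama-3.1-8B-Instruct   # placeholder expert
  - source_model: meta-llama/Meta-Llama-3.1-8B-Instruct   # placeholder expert
```

With mergekit installed, such a config is applied with `mergekit-moe config.yml ./output-model-dir`, which writes the merged (raw, untrained-router) MoE checkpoint to the given directory.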