maciek-pioro committed 8eb627b (parent: 709807e): Update README.md

README.md (changed):
# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

XXX is a [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) model fine-tuned on 2.2B Polish tokens selected from [SpeakLeash](https://speakleash.org/). This is, to our knowledge, the first open-weights MoE model fine-tuned on Polish data. In order to preserve English capabilities, we include about 600M tokens from the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T).

The training was made possible thanks to the TPU Research Cloud program. The model was trained on a TPUv3-256.
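As a rough illustration of the data mixture described above, the sketch below computes the approximate share of English tokens in the fine-tuning corpus. The 2.2B and 600M figures are the ones stated in this card; the helper function itself is hypothetical and only does the arithmetic:

```python
# Hypothetical helper illustrating the token mixture described in this card:
# 2.2B Polish tokens (SpeakLeash) + ~600M English tokens (RedPajama).

def mixture_share(polish_tokens: float, english_tokens: float) -> float:
    """Return the fraction of English tokens in the combined corpus."""
    total = polish_tokens + english_tokens
    return english_tokens / total

POLISH = 2.2e9   # 2.2B Polish tokens
ENGLISH = 0.6e9  # ~600M English tokens

share = mixture_share(POLISH, ENGLISH)
print(f"English share of the mix: {share:.1%}")  # roughly 21%
```

So English data makes up roughly a fifth of the mix, with the remainder Polish.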
## Model Details