maciek-pioro committed 8eb627b (parent: 709807e): Update README.md

README.md (changed):
# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

XXX is a [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) model fine-tuned on 2.2B Polish tokens selected from [SpeakLeash](https://speakleash.org/). This is, to our knowledge, the first open-weights MoE model fine-tuned on Polish data. In order to preserve English capabilities, we include about 600M tokens from the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T).

The training was made possible thanks to the TPU Research Cloud program. The model was trained on a TPUv3-256.
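As a rough illustration of the data mixture described above, the sketch below computes the approximate share of English tokens in the fine-tuning corpus. The 2.2B and 600M figures are the ones stated in this card; the helper function itself is hypothetical and only does the arithmetic:

```python
# Hypothetical helper illustrating the token mixture described in this card:
# 2.2B Polish tokens (SpeakLeash) + ~600M English tokens (RedPajama).

def mixture_share(polish_tokens: float, english_tokens: float) -> float:
    """Return the fraction of English tokens in the combined corpus."""
    total = polish_tokens + english_tokens
    return english_tokens / total

POLISH = 2.2e9   # 2.2B Polish tokens
ENGLISH = 0.6e9  # ~600M English tokens

share = mixture_share(POLISH, ENGLISH)
print(f"English share of the mix: {share:.1%}")  # roughly 21%
```

So English data makes up roughly a fifth of the mix, with the remainder Polish.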
## Model Details