ai-forever/mGPT-1.3B-kirgiz

@@ -19,18 +19,45 @@ Kirgiz belongs to Turkic language family. It's a very fluid language with approx
 2. It uses a version of the Cyrillic script.
 3. Manas, an epic poem in the Kyrgyz language, is one of the world's longest epics.
-## Dataset
-TBD
-## Technical details
-TBD
-## Examples of usage
-Try different generation strategies to reach better results.
-TBD
+## Technical details
+This is one of the models derived from the base [mGPT-XL (1.3B)](https://huggingface.co/ai-forever/mGPT) model (see the list below), which was originally trained on 61 languages from 25 language families using the Wikipedia and C4 corpora.
+We found additional data for 23 languages, most of which are considered minor, and decided to further tune the base model. **Kirgiz mGPT 1.3B** was trained for another 50,000 steps with batch_size=4 and a context window of **2048** tokens on a single A100.
+The final perplexity of this model on the validation set is **8.2**.
+_Chart of the training loss and perplexity:_
+![](https://i.imgur.com/t9v4Idk.png)
+## Other mGPT-1.3B models
+- [mGPT-1.3B-armenian](https://huggingface.co/ai-forever/mGPT-1.3B-armenian)
+- [mGPT-1.3B-azerbaijan](https://huggingface.co/ai-forever/mGPT-1.3B-azerbaijan)
+- [mGPT-1.3B-bashkir](https://huggingface.co/ai-forever/mGPT-1.3B-bashkir)
+- [mGPT-1.3B-belorussian](https://huggingface.co/ai-forever/mGPT-1.3B-belorussian)
+- [mGPT-1.3B-bulgarian](https://huggingface.co/ai-forever/mGPT-1.3B-bulgarian)
+- [mGPT-1.3B-buryat](https://huggingface.co/ai-forever/mGPT-1.3B-buryat)
+- [mGPT-1.3B-chuvash](https://huggingface.co/ai-forever/mGPT-1.3B-chuvash)
+- [mGPT-1.3B-georgian](https://huggingface.co/ai-forever/mGPT-1.3B-georgian)
+- [mGPT-1.3B-kalmyk](https://huggingface.co/ai-forever/mGPT-1.3B-kalmyk)
+- [mGPT-1.3B-kazakh](https://huggingface.co/ai-forever/mGPT-1.3B-kazakh)
+- [mGPT-1.3B-mari](https://huggingface.co/ai-forever/mGPT-1.3B-mari)
+- [mGPT-1.3B-mongol](https://huggingface.co/ai-forever/mGPT-1.3B-mongol)
+- [mGPT-1.3B-ossetian](https://huggingface.co/ai-forever/mGPT-1.3B-ossetian)
+- [mGPT-1.3B-persian](https://huggingface.co/ai-forever/mGPT-1.3B-persian)
+- [mGPT-1.3B-romanian](https://huggingface.co/ai-forever/mGPT-1.3B-romanian)
+- [mGPT-1.3B-tajik](https://huggingface.co/ai-forever/mGPT-1.3B-tajik)
+- [mGPT-1.3B-tatar](https://huggingface.co/ai-forever/mGPT-1.3B-tatar)
+- [mGPT-1.3B-turkmen](https://huggingface.co/ai-forever/mGPT-1.3B-turkmen)
+- [mGPT-1.3B-tuvan](https://huggingface.co/ai-forever/mGPT-1.3B-tuvan)
+- [mGPT-1.3B-ukranian](https://huggingface.co/ai-forever/mGPT-1.3B-ukranian)
+- [mGPT-1.3B-uzbek](https://huggingface.co/ai-forever/mGPT-1.3B-uzbek)
+- [mGPT-1.3B-yakut](https://huggingface.co/ai-forever/mGPT-1.3B-yakut)
+## Feedback
+If you find a bug or have additional data to train a model for your language, please give us feedback.
 The model will be improved over time. Stay tuned!
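
The training figures quoted under "Technical details" can be turned into two rough derived numbers. The sketch below is not from the model card: it assumes every training sequence filled the whole 2048-token context window with no gradient accumulation, and that the reported perplexity is the exponential of the mean per-token cross-entropy in nats. The variable names are illustrative only.

```python
import math

# Figures quoted in the "Technical details" section of the model card.
steps = 50_000          # additional fine-tuning steps
batch_size = 4          # sequences per step
context_window = 2048   # tokens per sequence
perplexity = 8.2        # reported validation perplexity

# Upper bound on tokens processed during fine-tuning.
# Assumption: each sequence is packed to the full context window and
# there is no gradient accumulation; the card states neither.
max_tokens_seen = steps * batch_size * context_window

# Assumption: perplexity = exp(mean cross-entropy in nats per token),
# so the implied validation loss is the natural log of the perplexity.
val_loss_nats = math.log(perplexity)

print(f"tokens seen (upper bound): {max_tokens_seen:,}")
print(f"implied validation loss: {val_loss_nats:.3f} nats/token")
```

Under these assumptions the fine-tuning run covers at most about 410 million tokens, and a perplexity of 8.2 corresponds to a validation loss of roughly 2.10 nats per token.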