Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ widget:

 # PULI GPTrio (6.7 billion parameter)

-For further details, see [our demo site](https://juniper.nytud.hu/demo/gptrio).
+For further details and to test our instruct model, see [our demo site](https://juniper.nytud.hu/demo/gptrio).

 - Hungarian-English-Chinese trilingual GPT-NeoX model (6.7 billion parameter)
 - Trained with EleutherAI's GPT-NeoX [github](https://github.com/EleutherAI/gpt-neox)
@@ -20,11 +20,11 @@ For further details, see [our demo site](https://juniper.nytud.hu/demo/gptrio).

 ## Dataset

-- Hungarian: 41
-- English: 61
-- Github: 6
-- Chinese: 98
-- (12
+- Hungarian: 41.5 billion words (314 GB)
+- English: 61.9 billion words (391 GB)
+- Github: 6 million documents (33 GB)
+- Chinese: 98.7 billion Chinese characters (340 GB)
+- (12 billion non-Chinese tokens)

 ## Limitations
