catherinearnett committed
Commit 8cdcdac
1 Parent(s): 0915eb1
Upload README.md with huggingface_hub
README.md CHANGED
@@ -12,7 +12,7 @@ library_name: transformers
 
 # B-GPT_es_en_simultaneous
 
-This is a bilingual GPT-2 style model. For the first half of training, this model was trained only on Spanish data. In the second half of training, the model was trained on a 50%-50% mix of Spanish and English data. At the end of training, 75
+This is a bilingual GPT-2 style model. For the first half of training, this model was trained only on Spanish data. In the second half of training, the model was trained on a 50%-50% mix of Spanish and English data. At the end of training, 75% of training data seen by the model is Spanish and 25% is English. The tokenizer was trained on the same overall proportions of data as the language model at the final step.
 
 ## Model details:
 
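
The 75%/25% split in the added line follows from the two-phase schedule it describes. A minimal sketch of that arithmetic is below, assuming both halves of training contain the same amount of data; the `cumulative_mix` helper and the `schedule` structure are hypothetical names used for illustration and are not taken from the model's training code.

```python
from fractions import Fraction


def cumulative_mix(phases):
    """Return each language's share of all data seen across training.

    `phases` is a list of (fraction_of_training, {language: share}) pairs.
    Hypothetical helper: it only reproduces the arithmetic in the README.
    """
    totals = {}
    for duration, mix in phases:
        for lang, share in mix.items():
            totals[lang] = totals.get(lang, Fraction(0)) + duration * share
    return totals


# Assumption: the two halves of training are equal in size.
schedule = [
    (Fraction(1, 2), {"es": Fraction(1), "en": Fraction(0)}),        # Spanish only
    (Fraction(1, 2), {"es": Fraction(1, 2), "en": Fraction(1, 2)}),  # 50%-50% mix
]

mix = cumulative_mix(schedule)
for lang, share in mix.items():
    print(f"{lang}: {float(share):.0%}")
```

Under that equal-halves assumption, Spanish accounts for 1/2 + 1/4 = 3/4 of the data and English for 1/4, which matches the proportions stated in the updated README line.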