Update README.md
README.md
CHANGED
@@ -31,3 +31,6 @@ model = AutoModelForCausalLM.from_pretrained("nhanv/vi-gpt2")
 
 # Model architecture
 A 12-layer, 768-hidden-size transformer-based language model.
+
+# Training
+The model was trained on the Vietnamese OSCAR dataset (32 GB) to optimize a traditional language modelling objective on a v3-8 TPU for around 6 days. It reaches a perplexity of around 13.4 on a validation set drawn from OSCAR.
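
For context, here is a minimal sketch (not part of the model card) of how a perplexity figure like the reported ~13.4 could be checked for this model. The sample sentence stands in for text from the OSCAR validation split, which the card does not publish:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nhanv/vi-gpt2")
model = AutoModelForCausalLM.from_pretrained("nhanv/vi-gpt2")
model.eval()

# Stand-in for a held-out Vietnamese sentence from OSCAR (assumption).
text = "Hà Nội là thủ đô của Việt Nam."
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # cross-entropy of next-token prediction over the sequence.
    out = model(**enc, labels=enc["input_ids"])

# Perplexity is the exponential of the mean cross-entropy loss.
print(f"perplexity: {math.exp(out.loss.item()):.1f}")
```

A faithful reproduction of the reported number would average the loss over the full validation set rather than a single sentence.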