Update README.md
README.md CHANGED
@@ -34,3 +34,22 @@ A 12-layer, 768-hidden-size transformer-based language model.

# Training

The model was trained on the Vietnamese OSCAR dataset (32 GB) to optimize a traditional causal language-modelling objective on a v3-8 TPU for around 6 days. It reaches a perplexity of around 13.4 on a validation set held out from OSCAR.
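
Since the training objective is standard causal language modelling, the reported perplexity is simply the exponential of the mean cross-entropy loss. The snippet below is a minimal sketch of that computation using the published checkpoint; the example sentence is an arbitrary stand-in (not the OSCAR validation split behind the 13.4 figure), and it assumes PyTorch weights are available for the model.

```python
# Minimal sketch: perplexity = exp(mean causal-LM cross-entropy).
# The sentence below is an arbitrary example, not the OSCAR validation split
# used for the reported ~13.4 perplexity.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NlpHUST/gpt2-vietnamese")
model = AutoModelForCausalLM.from_pretrained("NlpHUST/gpt2-vietnamese")
model.eval()

text = "Việt Nam là một quốc gia nằm ở Đông Nam Á."  # example sentence only
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels=input_ids makes the model return the shifted cross-entropy loss.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"perplexity ~ {torch.exp(loss).item():.2f}")
```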

### GPT-2 Fine-tuning

The following example fine-tunes GPT-2 on WikiText-2. We're using the raw WikiText-2 (no tokens were replaced before the tokenization). The loss here is that of causal language modeling.

The fine-tuning script can be found [here](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py).

```bash
python run_clm.py \
    --model_name_or_path NlpHUST/gpt2-vietnamese \
    --dataset_name wikitext \
    --dataset_config_name wikitext-2-raw-v1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --do_train \
    --do_eval \
    --output_dir /tmp/test-clm
```
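
After the run finishes, the fine-tuned model is written to the `--output_dir` passed above. The following minimal sketch loads it and samples a continuation; it assumes the run completed and saved both model and tokenizer to `/tmp/test-clm`, and the prompt and sampling settings are illustrative only.

```python
# Minimal sketch: load the checkpoint written by run_clm.py above and generate text.
# Assumes the run saved both model and tokenizer to /tmp/test-clm (the --output_dir used above).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("/tmp/test-clm")
model = AutoModelForCausalLM.from_pretrained("/tmp/test-clm")

prompt = "The history of language modeling"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```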