---
license: apache-2.0
datasets:
- TigerResearch/pretrain_zh
language:
- zh
pipeline_tag: text-generation
---

Based on [gpt2](https://huggingface.co./gpt2).

Trained with [LLaMA-Efficient-Tuning](https://github.com/hiyouga/LLaMA-Efficient-Tuning) on the [TigerResearch/pretrain_zh](https://huggingface.co./datasets/TigerResearch/pretrain_zh) Chinese pre-training corpus, using the command below:

```sh
# Continued pre-training (--stage pt) of gpt2 on the TigerResearch Chinese corpus
python src/train_bash.py \
    --stage pt \
    --model_name_or_path gpt2 \
    --do_train \
    --finetuning_type full \
    --dataset tiger_pretrain_zh \
    --template ziya \
    --use_fast_tokenizer \
    --preprocessing_num_workers 64 \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 16 \
    --output_dir .cache/gpt2-tigerResearch_pretrain_zh \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --eval_steps 500 \
    --learning_rate 5e-5 \
    --num_train_epochs 1.0 \
    --val_size 3000 \
    --evaluation_strategy steps \
    --plot_loss \
    --max_source_length 1024 \
    --bf16
```
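
With `--per_device_train_batch_size 8` and `--gradient_accumulation_steps 16`, each optimizer step sees an effective batch of 8 × 16 = 128 sequences per device; `--val_size 3000` holds out 3,000 examples for the evaluation run every 500 steps.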
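
Since the result is a standard GPT-2 checkpoint, the usual 🤗 Transformers `text-generation` pipeline applies. A minimal inference sketch, assuming the weights saved to the `--output_dir` above (substitute a Hub repo id if the checkpoint is published):

```python
# Minimal inference sketch. The path below is the local output_dir from the
# training command above; replace it with a Hub repo id for published weights.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model=".cache/gpt2-tigerResearch_pretrain_zh",
)

# Chinese prompt: "The capital of China is"
outputs = generator("中国的首都是", max_new_tokens=50, do_sample=True, top_p=0.9)
print(outputs[0]["generated_text"])
```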