shi3z/RWKV-LM-LoRA-Alpaca-Cleaned-Japan

shi3z committed on Apr 26, 2023

Commit 60e947a • 1 parent: d34f030

Update README.md

Files changed (1)

README.md +26 -0

@@ -1,3 +1,29 @@
 ---
 license: cc-by-nc-4.0
+language:
+- ja
 ---
+BaseModel: RWKV-4-Pile-14B-Instruct-test5-20230329-ctx4096.pth
+https://huggingface.co/BlinkDL/rwkv-4-pile-7b/tree/main
+Training Code:
+https://github.com/Blealtan/RWKV-LM-LoRA
+Training Prompt:
+python train.py \
+  --load_model RWKV-4-Pile-14B-Instruct-test5-20230329-ctx4096.pth \
+  --proj_dir out \
+  --data_file "train.npy" \
+  --data_type "numpy" \
+  --vocab_size 50277 \
+  --ctx_len 1024 \
+  --epoch_save 5 \
+  --epoch_count 100 \
+  --n_layer 40 \
+  --n_embd 5120 \
+  --epoch_steps 1000 --epoch_begin 0  --micro_bsz 1 --pre_ffn 0 --head_qk 0 --lr_init 1e-5 --lr_final 1e-5 --warmup_steps 0 --beta1 0.9 --beta2 0.999 --adam_eps 1e-8 --accelerator gpu --devices 1 --precision bf16 --strategy deepspeed_stage_2 --grad_cp 0 \
+  --lora --lora_r 8 --lora_alpha 32 --lora_dropout 0.01
+Training Environment:
+A6000x1 + XEON 256GB RAM
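
The flags in the training command above imply a few derived quantities that are easy to check by hand. The following is a minimal sketch, not part of the committed README or of RWKV-LM-LoRA itself. It assumes the conventional LoRA scaling of `lora_alpha / lora_r`, assumes that one optimizer step consumes `micro_bsz * devices` sequences of `ctx_len` tokens, and uses a hypothetical square `n_embd x n_embd` weight matrix to illustrate adapter size. All function names are illustrative.

```python
# Values copied from the training command in the diff above.
CTX_LEN = 1024
MICRO_BSZ = 1
DEVICES = 1
EPOCH_STEPS = 1000
EPOCH_COUNT = 100
N_EMBD = 5120
LORA_R = 8
LORA_ALPHA = 32


def lora_scaling(alpha: int, r: int) -> float:
    """Conventional LoRA scale applied to the low-rank update (assumption: alpha / r)."""
    return alpha / r


def tokens_per_epoch(epoch_steps: int, micro_bsz: int, devices: int, ctx_len: int) -> int:
    """Tokens seen per epoch, assuming each step processes micro_bsz * devices sequences."""
    return epoch_steps * micro_bsz * devices * ctx_len


def lora_params_for_square_matrix(n_embd: int, r: int) -> int:
    """Trainable parameters added by one rank-r adapter on an n_embd x n_embd matrix.

    The adapter consists of two factors, A (r x n_embd) and B (n_embd x r).
    Which matrices actually receive adapters depends on the training code.
    """
    return 2 * n_embd * r


if __name__ == "__main__":
    per_epoch = tokens_per_epoch(EPOCH_STEPS, MICRO_BSZ, DEVICES, CTX_LEN)
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_R))
    print("Tokens per epoch:", per_epoch)
    print("Tokens over all epochs:", per_epoch * EPOCH_COUNT)
    print("Adapter params per square matrix:", lora_params_for_square_matrix(N_EMBD, LORA_R))
```

Under these assumptions, each rank-8 adapter is small relative to the 5120 x 5120 matrix it modifies, which is consistent with the run fitting on the single A6000 listed under Training Environment.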