Update README.md
README.md CHANGED
@@ -26,26 +26,33 @@ This model is a fine-tuned version of upstage/SOLAR-10.7B-v1.0
The following hyperparameters were used during training:

```python
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+# Hyperparameters for the dataset and the number of training epochs
+batch_size = 16
+num_epochs = 1
+micro_batch = 1
+gradient_accumulation_steps = batch_size // micro_batch
+
+# Hyperparameters for the training method
+cutoff_len = 4096
+lr_scheduler = 'cosine'
+warmup_ratio = 0.06 # warmup_steps = 100
+learning_rate = 4e-4
+optimizer = 'adamw_torch'
+weight_decay = 0.01
+max_grad_norm = 1.0
+
+# LoRA config
+lora_r = 16
+lora_alpha = 16
+lora_dropout = 0.05
+lora_target_modules = ["gate_proj", "down_proj", "up_proj"]
+
+# Options for the input values produced by the tokenizer
+train_on_inputs = False
+add_eos_token = False
+
+# NEFTune params
+noise_alpha: int = 5
```
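
These values read as a LoRA fine-tuning recipe: an effective batch size of 16 is reached by accumulating micro-batches of 1 over `gradient_accumulation_steps = 16 // 1 = 16` steps, and NEFTune noise is added to the embeddings during training. The card does not say which training script consumed these values; as a minimal sketch, assuming the Hugging Face `peft`/`transformers` stack, they could map onto a `LoraConfig` and `TrainingArguments` like this (`output_dir` is a hypothetical placeholder, not from the card):

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings, taken from the values above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["gate_proj", "down_proj", "up_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Optimizer and schedule settings, taken from the values above.
training_args = TrainingArguments(
    output_dir="./outputs",          # hypothetical path, not from the card
    per_device_train_batch_size=1,   # micro_batch
    gradient_accumulation_steps=16,  # batch_size // micro_batch
    num_train_epochs=1,
    learning_rate=4e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,               # the card notes this was ~100 warmup steps
    optim="adamw_torch",
    weight_decay=0.01,
    max_grad_norm=1.0,
    neftune_noise_alpha=5,           # NEFTune; supported by recent transformers Trainer versions
)
```

`train_on_inputs = False` conventionally means the prompt tokens are masked out of the loss so the model is trained only on the response. A hypothetical illustration of that masking (the helper below is not from the card):

```python
IGNORE_INDEX = -100  # label value ignored by the default cross-entropy loss in transformers

def build_labels(prompt_ids, response_ids, train_on_inputs=False):
    """Concatenate prompt and response; optionally mask the prompt in the labels."""
    input_ids = prompt_ids + response_ids
    if train_on_inputs:
        labels = list(input_ids)  # compute loss on every token
    else:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels
```
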
## Framework versions