Built with Axolotl

See axolotl config

axolotl version: 0.4.0

base_model: T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0
base_model_config: T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
is_llama_derived_model: true
hub_model_id: T3Q-LLM-sft1.0-dpo1.0_4300QA

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  # - path: admin_data.csv
    - path: superiort/multiplechoice-4300
      type: alpaca
      # The below are defaults. only set what's needed if you use a different column name.
      # system_prompt: ""
      # system_format: "{system}"
      # field_system: system
      # field_instruction: instruction
      # field_input: input
      # field_output: output

      # format: |-
      #   Human: {instruction} {input}
      #   Assistant:

      # no_input_format: "{instruction} "

# dataset_prepared_path: yanolja_preprocessed_data
dataset_prepared_path: last_run_prepared
val_set_size: 0.2
output_dir: ./T3Q-LLM-sft1.0-dpo1.0_4300QA

adapter: qlora
lora_model_dir: 

# device_map: [0,1,3]

sequence_len: 4096
sample_packing: false

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out: 

wandb_project: axolotl_T3Q_4300
wandb_entity: 
wandb_watch: 
wandb_run_id: T3Q_mod_4300
wandb_log_model: 

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 10
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience: 
resume_from_checkpoint: 
local_rank: 
logging_steps: 1
xformers_attention: 
flash_attention: true

warmup_steps: 100
eval_steps: 0.01
save_strategy: epoch
save_steps: 
debug: 
deepspeed: 
weight_decay: 0.0
fsdp: 
fsdp_config: 
special_tokens:
    bos_token: "<s>"
    eos_token: "<|im_end|>"
    unk_token: "<unk>"
    pad_token: "</s>"  # EOS와 PAD가 동일

T3Q-LLM-sft1.0-dpo1.0_4300QA

This model is a fine-tuned version of T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0 on the superiort/multiplechoice-4300 dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2288
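
A minimal loading sketch, assuming the adapter is applied on top of the base checkpoint with PEFT and loaded in 4-bit as in the training config; the adapter repository id is the one this card is published under and may differ in your setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "T3Q-LLM/T3Q-LLM-sft1.0-dpo1.0"
adapter_id = "superiort/T3Q-LLM-sft1.0-dpo1.0_4300QA_10epochs"  # adapter repo for this card

# Mirror the load_in_4bit: true setting used during training.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```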

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
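
The config above points at superiort/multiplechoice-4300 with type: alpaca; beyond that, no dataset details are documented here. As a rough guide, axolotl's alpaca dataset type renders prompts in the standard Alpaca instruction/input/response layout; the exact wording below is assumed, not confirmed by this card, and the filled-in fields are purely hypothetical:

```python
# Standard Alpaca-style template (assumed layout for axolotl's `alpaca` type).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

# Hypothetical multiple-choice example; not taken from the dataset.
prompt = ALPACA_TEMPLATE.format(
    instruction="Choose the correct answer to the following question.",
    input="Question text and answer options go here.",
)
```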

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
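
The derived batch sizes follow from the values above: total_train_batch_size = micro_batch_size × gradient_accumulation_steps × num_devices = 2 × 4 × 4 = 32, and total_eval_batch_size = eval_batch_size × num_devices = 2 × 4 = 8.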

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.2424 | 0.0093 | 1 | 1.0432 |
| 1.0333 | 0.1023 | 11 | 0.9004 |
| 0.8715 | 0.2047 | 22 | 0.7157 |
| 0.7053 | 0.3070 | 33 | 0.6548 |
| 0.6688 | 0.4093 | 44 | 0.6449 |
| 0.6823 | 0.5116 | 55 | 0.6282 |
| 0.5876 | 0.6140 | 66 | 0.6251 |
| 0.6994 | 0.7163 | 77 | 0.6290 |
| 0.6662 | 0.8186 | 88 | 0.6311 |
| 0.6239 | 0.9209 | 99 | 0.6338 |
| 0.5959 | 1.0233 | 110 | 0.6319 |
| 0.6408 | 1.1256 | 121 | 0.6668 |
| 0.595 | 1.2279 | 132 | 0.6221 |
| 0.5476 | 1.3302 | 143 | 0.6295 |
| 0.587 | 1.4326 | 154 | 0.6569 |
| 0.5867 | 1.5349 | 165 | 0.6208 |
| 0.5895 | 1.6372 | 176 | 0.6264 |
| 0.6581 | 1.7395 | 187 | 0.6208 |
| 0.5872 | 1.8419 | 198 | 0.6290 |
| 0.6314 | 1.9442 | 209 | 0.6243 |
| 0.4397 | 2.0465 | 220 | 0.6591 |
| 0.4568 | 2.1488 | 231 | 0.7095 |
| 0.422 | 2.2512 | 242 | 0.6914 |
| 0.453 | 2.3535 | 253 | 0.7001 |
| 0.4678 | 2.4558 | 264 | 0.6896 |
| 0.4335 | 2.5581 | 275 | 0.6776 |
| 0.4796 | 2.6605 | 286 | 0.6829 |
| 0.4637 | 2.7628 | 297 | 0.6742 |
| 0.4532 | 2.8651 | 308 | 0.6828 |
| 0.4348 | 2.9674 | 319 | 0.6836 |
| 0.2787 | 3.0698 | 330 | 0.8085 |
| 0.2336 | 3.1721 | 341 | 0.8380 |
| 0.2341 | 3.2744 | 352 | 0.7998 |
| 0.2393 | 3.3767 | 363 | 0.8041 |
| 0.2826 | 3.4791 | 374 | 0.8040 |
| 0.2505 | 3.5814 | 385 | 0.8099 |
| 0.3057 | 3.6837 | 396 | 0.8103 |
| 0.2789 | 3.7860 | 407 | 0.7964 |
| 0.269 | 3.8884 | 418 | 0.7891 |
| 0.2493 | 3.9907 | 429 | 0.7958 |
| 0.1193 | 4.0930 | 440 | 0.9242 |
| 0.1143 | 4.1953 | 451 | 0.9331 |
| 0.1147 | 4.2977 | 462 | 0.9112 |
| 0.1351 | 4.4 | 473 | 0.9290 |
| 0.0982 | 4.5023 | 484 | 0.9358 |
| 0.1011 | 4.6047 | 495 | 0.9279 |
| 0.09 | 4.7070 | 506 | 0.9289 |
| 0.1063 | 4.8093 | 517 | 0.9392 |
| 0.1038 | 4.9116 | 528 | 0.9267 |
| 0.0361 | 5.0140 | 539 | 0.9412 |
| 0.0371 | 5.1163 | 550 | 1.0589 |
| 0.033 | 5.2186 | 561 | 1.0253 |
| 0.0426 | 5.3209 | 572 | 1.0482 |
| 0.0357 | 5.4233 | 583 | 1.0388 |
| 0.0355 | 5.5256 | 594 | 1.0566 |
| 0.0373 | 5.6279 | 605 | 1.0470 |
| 0.0395 | 5.7302 | 616 | 1.0581 |
| 0.0366 | 5.8326 | 627 | 1.0696 |
| 0.0387 | 5.9349 | 638 | 1.0641 |
| 0.0127 | 6.0372 | 649 | 1.0692 |
| 0.0114 | 6.1395 | 660 | 1.1612 |
| 0.0105 | 6.2419 | 671 | 1.1575 |
| 0.0121 | 6.3442 | 682 | 1.1479 |
| 0.0082 | 6.4465 | 693 | 1.1591 |
| 0.011 | 6.5488 | 704 | 1.1669 |
| 0.0112 | 6.6512 | 715 | 1.1645 |
| 0.0109 | 6.7535 | 726 | 1.1628 |
| 0.0102 | 6.8558 | 737 | 1.1705 |
| 0.0098 | 6.9581 | 748 | 1.1769 |
| 0.006 | 7.0605 | 759 | 1.1840 |
| 0.0064 | 7.1628 | 770 | 1.2016 |
| 0.0063 | 7.2651 | 781 | 1.2133 |
| 0.0058 | 7.3674 | 792 | 1.2182 |
| 0.0056 | 7.4698 | 803 | 1.2218 |
| 0.0057 | 7.5721 | 814 | 1.2234 |
| 0.0059 | 7.6744 | 825 | 1.2245 |
| 0.0057 | 7.7767 | 836 | 1.2247 |
| 0.0048 | 7.8791 | 847 | 1.2247 |
| 0.0054 | 7.9814 | 858 | 1.2246 |
| 0.0051 | 8.0837 | 869 | 1.2252 |
| 0.0059 | 8.1860 | 880 | 1.2261 |
| 0.0053 | 8.2884 | 891 | 1.2272 |
| 0.0057 | 8.3907 | 902 | 1.2275 |
| 0.0056 | 8.4930 | 913 | 1.2280 |
| 0.0052 | 8.5953 | 924 | 1.2283 |
| 0.007 | 8.6977 | 935 | 1.2287 |
| 0.0052 | 8.8 | 946 | 1.2285 |
| 0.005 | 8.9023 | 957 | 1.2289 |
| 0.0056 | 9.0047 | 968 | 1.2288 |
| 0.005 | 9.1070 | 979 | 1.2289 |
| 0.0054 | 9.2093 | 990 | 1.2290 |
| 0.0053 | 9.3116 | 1001 | 1.2288 |
| 0.0049 | 9.4140 | 1012 | 1.2290 |
| 0.0052 | 9.5163 | 1023 | 1.2290 |
| 0.0058 | 9.6186 | 1034 | 1.2291 |
| 0.0059 | 9.7209 | 1045 | 1.2289 |
| 0.0055 | 9.8233 | 1056 | 1.2289 |
| 0.0054 | 9.9256 | 1067 | 1.2288 |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.1
  • Pytorch 2.1.2+cu121
  • Datasets 2.15.0
  • Tokenizers 0.19.1