---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
base_model:
- Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
tags:
- trl
- qwen
- sft
- alignment
- transformers
- custom
- chat
---
# Qwen2.5-1.5B-ultrachat200k

## Model Details

- Model type: SFT (supervised fine-tuned) model
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen2.5-1.5B
- Training data: HuggingFaceH4/ultrachat_200k
- Training framework: TRL
## Training Details

Trained with custom training code.
### Training Hyperparameters

- attn_implementation: flash_attention_2
- bf16: True
- learning_rate: 5e-5
- lr_scheduler_type: cosine
- per_device_train_batch_size: 2
- gradient_accumulation_steps: 16
- torch_dtype: bfloat16
- num_train_epochs: 1
- max_seq_length: 2048
- warmup_ratio: 0.1
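The hyperparameters above imply an effective global batch size and a warmup schedule derived from `warmup_ratio`. A minimal sketch of that arithmetic, assuming a single GPU and a hypothetical round figure of 200,000 training examples (the actual split size of ultrachat_200k may differ):

```python
# Derive effective batch size and warmup steps from the hyperparameters above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 16
num_gpus = 1                # assumption: single-GPU run
warmup_ratio = 0.1
num_train_epochs = 1
num_examples = 200_000      # hypothetical; actual train split size may differ

# Optimizer updates see this many examples per step.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

steps_per_epoch = num_examples // effective_batch_size
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = int(total_steps * warmup_ratio)

print(effective_batch_size)  # 32
print(total_steps)           # 6250
print(warmup_steps)          # 625
```

With a cosine scheduler, the learning rate ramps linearly from 0 to 5e-5 over the first ~10% of steps, then decays following a cosine curve for the remainder.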
## Results

- init_train_loss: 1.421
- final_train_loss: 1.192
- eval_loss: 1.2003