---
license: apache-2.0
datasets:
- HuggingFaceH4/ultrachat_200k
base_model:
- Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
tags:
- trl
- qwen
- sft
- alignment
- transformers
- custom
- chat
---
# Qwen2.5-1.5B-ultrachat200k

## Model Details

- Model type: SFT (supervised fine-tuned) model
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen2.5-1.5B
- Training data: HuggingFaceH4/ultrachat_200k
- Training framework: TRL
## Training Details

Trained with custom training code.
### Training Hyperparameters

- attn_implementation: flash_attention_2
- bf16: True
- learning_rate: 5e-5
- lr_scheduler_type: cosine
- per_device_train_batch_size: 2
- gradient_accumulation_steps: 16
- torch_dtype: bfloat16
- num_train_epochs: 1
- max_seq_length: 2048
- warmup_ratio: 0.1
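The hyperparameters above imply an effective global batch size and a warmup schedule derived from `warmup_ratio`. A minimal sketch of that arithmetic, assuming a single GPU and a hypothetical round figure of 200,000 training examples (the actual split size of ultrachat_200k may differ):

```python
# Derive effective batch size and warmup steps from the hyperparameters above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 16
num_gpus = 1                # assumption: single-GPU run
warmup_ratio = 0.1
num_train_epochs = 1
num_examples = 200_000      # hypothetical; actual train split size may differ

# Optimizer updates see this many examples per step.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

steps_per_epoch = num_examples // effective_batch_size
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = int(total_steps * warmup_ratio)

print(effective_batch_size)  # 32
print(total_steps)           # 6250
print(warmup_steps)          # 625
```

With a cosine scheduler, the learning rate ramps linearly from 0 to 5e-5 over the first ~10% of steps, then decays following a cosine curve for the remainder.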
## Results

- init_train_loss: 1.421
- final_train_loss: 1.192
- eval_loss: 1.2003