Yi-1.5-6B-sft-241208

This model is a fine-tuned version of saves/Yi-1.5-6B_pt_241207 on the following datasets: chinese-medical-dialogue, CMB, cMedQA2, CMExam, CMtMedQA, COIG-CQIA-full, COIG_full, HuatuoGPT_sft_data_v, huatuo_encyclopedia_q, huatuo_lite, imcs21, Med-single-choice, Medical_dialogue_system_en_single_turn, qizhengpt-sft-20, self_cognition, sharegpt_zh_38K_format, shennong, shibing642-medica, tigerbot_sft_data, xywy-KG and zhongyi-zhiku. It achieves the following results on the evaluation set:

Loss: 1.4955
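
For readers who prefer perplexity, the reported loss can be converted as sketched below. This assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training with Transformers but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.4955

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is roughly 4.46.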

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2.5e-06
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
num_epochs: 2.0
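
The sketch below shows how these settings relate to each other: the total batch size follows from the per-device batch size and device count, and the learning rate follows a linear warmup and then a cosine decay. The function names are hypothetical, the schedule shape is an assumption modeled on the common "cosine with warmup" scheduler rather than taken from this card, and `total_steps` is an estimate derived from the training results table (step 15000 at epoch 1.9152).

```python
import math

def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Number of samples consumed per optimizer step."""
    return per_device * num_devices * grad_accum_steps

def cosine_lr_with_warmup(step, total_steps, base_lr=2.5e-06, warmup_ratio=0.05):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero.

    Assumption: warmup length is ceil(total_steps * warmup_ratio).
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# 4 samples per device on 8 devices gives 32, matching total_train_batch_size.
# This implies no gradient accumulation (grad_accum_steps = 1).
total_batch = effective_batch_size(per_device=4, num_devices=8)

# Estimated total optimizer steps for 2 epochs (assumption, see lead-in).
total_steps = 15_664

for step in (0, 500, 1000, 8000, 15_000):
    print(step, f"{cosine_lr_with_warmup(step, total_steps):.3e}")
```

With a 0.05 warmup ratio, the peak learning rate of 2.5e-06 is reached after roughly the first 780 steps and decays toward zero by the end of the second epoch.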

Training results

Training Loss	Epoch	Step	Validation Loss
1.6946	0.1277	1000	1.6502
1.6006	0.2554	2000	1.6065
1.5703	0.3830	3000	1.5798
1.6069	0.5107	4000	1.5604
1.5473	0.6384	5000	1.5453
1.5206	0.7661	6000	1.5329
1.4961	0.8938	7000	1.5222
1.4639	1.0215	8000	1.5162
1.4879	1.1491	9000	1.5104
1.4931	1.2768	10000	1.5055
1.503	1.4045	11000	1.5014
1.4826	1.5322	12000	1.4985
1.4544	1.6599	13000	1.4966
1.4557	1.7875	14000	1.4958
1.4839	1.9152	15000	1.4955
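
A few quantities can be read off the table with simple arithmetic, as in the sketch below. The samples-per-epoch figure assumes a total batch size of 32 with one training example per sequence (no gradient accumulation and no sequence packing), which this card does not confirm.

```python
# (training loss, epoch, step, validation loss) rows copied from the table above.
rows = [
    (1.6946, 0.1277, 1000, 1.6502),
    (1.6006, 0.2554, 2000, 1.6065),
    (1.5703, 0.3830, 3000, 1.5798),
    (1.6069, 0.5107, 4000, 1.5604),
    (1.5473, 0.6384, 5000, 1.5453),
    (1.5206, 0.7661, 6000, 1.5329),
    (1.4961, 0.8938, 7000, 1.5222),
    (1.4639, 1.0215, 8000, 1.5162),
    (1.4879, 1.1491, 9000, 1.5104),
    (1.4931, 1.2768, 10000, 1.5055),
    (1.503, 1.4045, 11000, 1.5014),
    (1.4826, 1.5322, 12000, 1.4985),
    (1.4544, 1.6599, 13000, 1.4966),
    (1.4557, 1.7875, 14000, 1.4958),
    (1.4839, 1.9152, 15000, 1.4955),
]

# Steps per epoch, estimated from the last row (step / epoch).
_, last_epoch, last_step, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# Assumption: 32 examples per step, one example per sequence.
samples_per_epoch = steps_per_epoch * 32

# Check whether validation loss fell at every evaluation point.
val_losses = [r[3] for r in rows]
still_improving = all(b < a for a, b in zip(val_losses, val_losses[1:]))

# Improvement between the last two evaluations.
last_gain = val_losses[-2] - val_losses[-1]

print(f"steps per epoch   ~ {steps_per_epoch:.0f}")
print(f"samples per epoch ~ {samples_per_epoch:.0f}")
print(f"monotonic decrease: {still_improving}, last gain: {last_gain:.4f}")
```

The validation loss decreases at every evaluation, but the gain over the final 1000 steps is only 0.0003, which suggests the run had largely plateaued by the end of the second epoch.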

Framework versions

Transformers 4.44.2
Pytorch 2.4.0+cu121
Datasets 2.21.0
Tokenizers 0.19.1

Model repository: SYSU-MUCFC-FinTech-Research-Center/Zhongsi-6B-Instruct
