KoModernBERT-chp-06

This model is a fine-tuned version of CocoRoF/KoModernBERT-chp-05 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9652

Model description

More information needed

Intended uses & limitations

More information needed
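
Pending further details from the authors, the sketch below shows one plausible way to query the checkpoint. It is an assumption, not documented usage: it presumes this checkpoint is a masked language model (consistent with the MLM-style losses reported below) and that [MASK] is the tokenizer's mask token; check tokenizer.mask_token before relying on it. The Korean prompt is illustrative.

```python
# Minimal fill-mask sketch (assumption: this checkpoint is a masked
# language model, consistent with the MLM-style losses reported below).
import torch
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="CocoRoF/KoModernBERT-chp-06",
    torch_dtype=torch.float16,  # the published weights are FP16
)

# "The capital of South Korea is [MASK]." -- illustrative prompt; [MASK]
# assumes the tokenizer's default mask token (check tokenizer.mask_token).
for candidate in fill_mask("대한민국의 수도는 [MASK]이다."):
    print(candidate["token_str"], round(candidate["score"], 4))
```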

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 512
  • total_eval_batch_size: 64
  • optimizer: AdamW (torch implementation, `adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 1.0
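
For reference, the listing below sketches how these values map onto Hugging Face TrainingArguments. This is a reconstruction, not the authors' script: the output directory, fp16 flag, and evaluation cadence are assumptions (the 5000-step cadence matches the results table below; the published weights are FP16).

```python
# Sketch of TrainingArguments mirroring the hyperparameters above.
# Assumptions: standard Hugging Face Trainer, fp16 training, eval every
# 5000 steps; multi-GPU via a launcher such as torchrun across 8 devices.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="KoModernBERT-chp-06",  # illustrative output path
    learning_rate=1e-5,
    per_device_train_batch_size=8,   # 8 per GPU x 8 GPUs x 8 accum steps = 512 total
    per_device_eval_batch_size=8,    # 8 per GPU x 8 GPUs = 64 total
    gradient_accumulation_steps=8,
    num_train_epochs=1.0,
    lr_scheduler_type="linear",
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                       # assumption; weights are published in FP16
    eval_strategy="steps",
    eval_steps=5000,                 # matches the eval cadence in the results table
)
```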

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|---------------|--------|-------|-----------------|
| 17.0359       | 0.0542 | 5000  | 2.0982          |
| 16.6082       | 0.1085 | 10000 | 2.0874          |
| 16.3776       | 0.1627 | 15000 | 2.0795          |
| 16.3601       | 0.2170 | 20000 | 2.0687          |
| 16.4271       | 0.2712 | 25000 | 2.0609          |
| 16.1757       | 0.3255 | 30000 | 2.0522          |
| 16.4832       | 0.3797 | 35000 | 2.0439          |
| 16.243        | 0.4340 | 40000 | 2.0324          |
| 16.1772       | 0.4882 | 45000 | 2.0268          |
| 16.2524       | 0.5424 | 50000 | 2.0190          |
| 16.2925       | 0.5967 | 55000 | 2.0131          |
| 16.1287       | 0.6509 | 60000 | 1.9996          |
| 15.9769       | 0.7052 | 65000 | 1.9977          |
| 15.7585       | 0.7594 | 70000 | 1.9885          |
| 16.0227       | 0.8137 | 75000 | 1.9840          |
| 15.4506       | 0.8679 | 80000 | 1.9777          |
| 15.7857       | 0.9222 | 85000 | 1.9699          |
| 15.3766       | 0.9764 | 90000 | 1.9652          |
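
Validation loss decreases monotonically from 2.0982 to 1.9652 over the single epoch. If this value is the usual mean per-token cross-entropy of masked-LM evaluation (an assumption; the card does not say), it can be read as a pseudo-perplexity via exp(loss):

```python
# Converting the final validation loss to pseudo-perplexity
# (assumption: the loss is mean per-token cross-entropy in nats).
import math

final_val_loss = 1.9652
print(math.exp(final_val_loss))  # ~7.14
```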

Framework versions

  • Transformers 4.48.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • 153M parameters (safetensors, FP16)