---
library_name: transformers
license: apache-2.0
base_model: CocoRoF/KoModernBERT-chp-01
tags:
- generated_from_trainer
model-index:
- name: KoModernBERT-chp-02
  results: []
---

# KoModernBERT-chp-02

This model is a fine-tuned version of [CocoRoF/KoModernBERT-chp-01](https://huggingface.co/CocoRoF/KoModernBERT-chp-01) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 2.1608
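
Assuming the reported evaluation loss is the mean per-token cross-entropy of the masked-LM objective (the usual `Trainer` convention, in nats), the corresponding perplexity can be sketched as:

```python
import math

# Final evaluation loss reported above (assumed: mean token-level cross-entropy, in nats).
eval_loss = 2.1608

# Under that assumption, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 8.68
```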

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 512
- total_eval_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
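
The effective batch size and the linear-warmup/linear-decay schedule implied by these settings can be sketched in plain Python (the helper name and the step counts used below are illustrative, not taken from the training code):

```python
# Effective train batch size: per-device batch × devices × gradient accumulation.
train_batch_size = 8
num_devices = 8
gradient_accumulation_steps = 8
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
assert total_train_batch_size == 512  # matches the value reported above

def lr_at_step(step: int, total_steps: int,
               base_lr: float = 1e-05, warmup_ratio: float = 0.1) -> float:
    """Linear warmup over the first warmup_ratio of steps, then linear decay to 0."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Illustrative run of 1000 optimizer steps: the LR peaks at the end of warmup.
total = 1000
print(lr_at_step(0, total))     # 0.0
print(lr_at_step(100, total))   # 1e-05 (peak, at 10% of training)
print(lr_at_step(1000, total))  # 0.0
```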

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 17.4659       | 0.0904 | 5000  | 2.2019          |
| 17.5275       | 0.1808 | 10000 | 2.1976          |
| 17.7839       | 0.2712 | 15000 | 2.1957          |
| 17.1639       | 0.3616 | 20000 | 2.1879          |
| 17.4977       | 0.4520 | 25000 | 2.1829          |
| 17.001        | 0.5424 | 30000 | 2.1814          |
| 17.4017       | 0.6329 | 35000 | 2.1747          |
| 17.0944       | 0.7233 | 40000 | 2.1722          |
| 17.46         | 0.8137 | 45000 | 2.1676          |
| 17.0978       | 0.9041 | 50000 | 2.1674          |
| 17.1695       | 0.9945 | 55000 | 2.1608          |
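
Validation loss improves at every logged checkpoint in the table above, so the final checkpoint is also the best one. A small helper (illustrative, not part of the training code) to confirm this from the logged values:

```python
# (step, validation_loss) pairs copied from the table above.
eval_log = [
    (5000, 2.2019), (10000, 2.1976), (15000, 2.1957), (20000, 2.1879),
    (25000, 2.1829), (30000, 2.1814), (35000, 2.1747), (40000, 2.1722),
    (45000, 2.1676), (50000, 2.1674), (55000, 2.1608),
]

# Loss decreases strictly at every checkpoint...
losses = [loss for _, loss in eval_log]
assert all(a > b for a, b in zip(losses, losses[1:]))

# ...so the checkpoint with the lowest validation loss is the last one.
best_step, best_loss = min(eval_log, key=lambda pair: pair[1])
print(best_step, best_loss)  # 55000 2.1608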

### Framework versions

- Transformers 4.48.1
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0