Hubert-noisy-cv-kakeiken-J_ver5

This model is a fine-tuned version of rinna/japanese-hubert-base on the ORIGINAL_NOISY_KAKEIKEN_W - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0313
  • WER: 0.9988
  • CER: 1.0167
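
The card does not include usage code. Below is a minimal inference sketch, assuming the checkpoint exposes a CTC head with a Wav2Vec2-style processor (typical for HuBERT ASR fine-tunes) and expects 16 kHz mono audio; the file name sample.wav is a placeholder.

```python
# Hedged inference sketch (not from the model card): assumes a CTC head and a
# Wav2Vec2-style processor, and 16 kHz mono input audio.
import torch
import torchaudio
from transformers import AutoProcessor, HubertForCTC

model_id = "utakumi/Hubert-noisy-cv-kakeiken-J_ver5"
processor = AutoProcessor.from_pretrained(model_id)
model = HubertForCTC.from_pretrained(model_id)
model.eval()

# Load a placeholder audio file and resample to the 16 kHz rate used by japanese-hubert-base.
waveform, sr = torchaudio.load("sample.wav")
waveform = torchaudio.functional.resample(waveform, sr, 16_000).mean(dim=0)

inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```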

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative TrainingArguments mapping follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 40.0
  • mixed_precision_training: Native AMP
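
As a point of reference, here is a hedged sketch of how the listed settings map onto transformers.TrainingArguments. The output directory is a placeholder, and options not listed on the card (saving, evaluation strategy, data collation) are omitted.

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# output_dir is a placeholder; unlisted options are left at their defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-noisy-cv-kakeiken-J_ver5",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size: 32 * 2 = 64
    seed=42,
    optim="adamw_torch",
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12500,
    num_train_epochs=40.0,
    fp16=True,                       # "Native AMP" mixed-precision training
)
```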

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 28.345        | 1.0     | 820   | 10.8647         | 1.0    | 1.1283 |
| 9.1788        | 2.0     | 1640  | 7.5464          | 1.0    | 1.1284 |
| 6.9973        | 3.0     | 2460  | 4.2194          | 1.0    | 1.1284 |
| 3.6678        | 4.0     | 3280  | 3.0360          | 1.0    | 1.1284 |
| 2.7018        | 5.0     | 4100  | 2.3679          | 1.0    | 1.1284 |
| 2.2269        | 6.0     | 4920  | 1.1460          | 1.0    | 1.1401 |
| 0.9024        | 7.0     | 5740  | 0.5315          | 0.9997 | 1.1109 |
| 0.4487        | 8.0     | 6560  | 0.2406          | 0.9990 | 1.0341 |
| 0.3431        | 9.0     | 7380  | 0.1571          | 0.9988 | 1.0292 |
| 0.2582        | 10.0    | 8200  | 0.1372          | 0.9990 | 1.0318 |
| 0.2076        | 11.0    | 9020  | 0.2418          | 0.9991 | 1.0406 |
| 0.1942        | 12.0    | 9840  | 0.0756          | 0.9988 | 1.0243 |
| 0.1828        | 13.0    | 10660 | 0.1198          | 0.9990 | 1.0354 |
| 0.181         | 14.0    | 11480 | 0.0759          | 0.9988 | 1.0250 |
| 0.1662        | 15.0    | 12300 | 0.0673          | 0.9988 | 1.0255 |
| 0.1611        | 16.0    | 13120 | 0.0702          | 0.9990 | 1.0204 |
| 0.1542        | 17.0    | 13940 | 0.0960          | 0.9990 | 1.0195 |
| 0.1459        | 18.0    | 14760 | 0.0424          | 0.9990 | 1.0207 |
| 0.1457        | 19.0    | 15580 | 0.0442          | 0.9988 | 1.0199 |
| 0.1289        | 20.0    | 16400 | 0.0677          | 0.9990 | 1.0253 |
| 0.1235        | 21.0    | 17220 | 0.0501          | 0.9990 | 1.0198 |
| 0.1225        | 22.0    | 18040 | 0.0388          | 0.9988 | 1.0188 |
| 0.1168        | 23.0    | 18860 | 0.0297          | 0.9988 | 1.0180 |
| 0.1121        | 24.0    | 19680 | 0.0316          | 0.9988 | 1.0180 |
| 0.1034        | 25.0    | 20500 | 0.0370          | 0.9988 | 1.0181 |
| 0.104         | 26.0    | 21320 | 0.0340          | 0.9988 | 1.0178 |
| 0.0912        | 27.0    | 22140 | 0.0299          | 0.9988 | 1.0183 |
| 0.0882        | 28.0    | 22960 | 0.0283          | 0.9988 | 1.0172 |
| 0.0807        | 29.0    | 23780 | 0.0316          | 0.9988 | 1.0172 |
| 0.0836        | 30.0    | 24600 | 0.0359          | 0.9988 | 1.0175 |
| 0.0806        | 31.0    | 25420 | 0.0343          | 0.9988 | 1.0169 |
| 0.0705        | 32.0    | 26240 | 0.0259          | 0.9988 | 1.0165 |
| 0.0668        | 33.0    | 27060 | 0.0289          | 0.9988 | 1.0169 |
| 0.0629        | 34.0    | 27880 | 0.0335          | 0.9988 | 1.0175 |
| 0.0641        | 35.0    | 28700 | 0.0337          | 0.9988 | 1.0173 |
| 0.0591        | 36.0    | 29520 | 0.0316          | 0.9988 | 1.0170 |
| 0.055         | 37.0    | 30340 | 0.0319          | 0.9988 | 1.0169 |
| 0.0554        | 38.0    | 31160 | 0.0319          | 0.9988 | 1.0169 |
| 0.057         | 39.0    | 31980 | 0.0311          | 0.9988 | 1.0167 |
| 0.0569        | 39.9518 | 32760 | 0.0313          | 0.9988 | 1.0167 |
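
WER and CER are word- and character-error rates; values above 1.0 are possible because insertions can push the edit distance past the reference length, and since Japanese transcripts are usually not whitespace-segmented, a whitespace-based WER can treat each utterance as a single word and stay near 1.0 even when CER is much lower. The card does not state which implementation produced the reported numbers; below is a minimal sketch using the evaluate library with hypothetical strings.

```python
# Hedged sketch: compute WER/CER with the `evaluate` library; the model card
# does not specify which implementation produced the reported metrics.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["きょうはいいてんきです"]  # hypothetical model output
references = ["今日はいい天気です"]      # hypothetical reference transcript

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```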

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0