---
library_name: transformers
language:
- ja
license: apache-2.0
base_model: rinna/japanese-hubert-base
tags:
- automatic-speech-recognition
- original_noisy_common_voice_and_kakeiken
- generated_from_trainer
metrics:
- wer
model-index:
- name: Hubert-noisy-cv-kakeiken
  results: []
---

# Hubert-noisy-cv-kakeiken

This model is a fine-tuned version of [rinna/japanese-hubert-base](https://huggingface.co./rinna/japanese-hubert-base) on the ORIGINAL_NOISY_COMMON_VOICE_AND_KAKEIKEN - JA dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9441
- Wer: 1.0
- Cer: 0.3276

(The flat WER of 1.0 is discussed in the note at the end of this card.)

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mapped onto `TrainingArguments` in the sketch at the end of this card):
- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 12500
- num_epochs: 30.0
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch   | Step   | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:------:|:---------------:|:------:|:------:|
| 0.3007        | 1.0     | 3463   | 0.9433          | 1.0    | 0.3277 |
| 0.1409        | 2.0     | 6926   | 1.0068          | 1.0    | 0.3606 |
| 0.1444        | 3.0     | 10389  | 1.0954          | 1.0    | 0.3839 |
| 0.1518        | 4.0     | 13852  | 1.2021          | 1.0016 | 0.4125 |
| 0.1691        | 5.0     | 17315  | 1.3227          | 1.0224 | 0.4465 |
| 0.1612        | 6.0     | 20778  | 1.2268          | 1.0087 | 0.4165 |
| 0.155         | 7.0     | 24241  | 1.3089          | 1.0160 | 0.4389 |
| 0.1529        | 8.0     | 27704  | 1.2341          | 1.0017 | 0.4234 |
| 0.1458        | 9.0     | 31167  | 1.2319          | 1.0095 | 0.4250 |
| 0.1371        | 10.0    | 34630  | 1.1689          | 1.0041 | 0.4131 |
| 0.1295        | 11.0    | 38093  | 1.2024          | 1.0278 | 0.4175 |
| 0.1347        | 12.0    | 41556  | 1.2089          | 1.0142 | 0.4192 |
| 0.1161        | 13.0    | 45019  | 1.1461          | 1.0371 | 0.3998 |
| 0.1162        | 14.0    | 48482  | 1.1236          | 1.0311 | 0.3920 |
| 0.1107        | 15.0    | 51945  | 1.0697          | 1.0276 | 0.3797 |
| 0.1029        | 16.0    | 55408  | 1.0551          | 1.0108 | 0.3806 |
| 0.0992        | 17.0    | 58871  | 1.0634          | 1.0187 | 0.3727 |
| 0.0906        | 18.0    | 62334  | 1.0299          | 1.0273 | 0.3657 |
| 0.0793        | 19.0    | 65797  | 1.0217          | 1.0149 | 0.3602 |
| 0.0769        | 20.0    | 69260  | 1.0025          | 1.0334 | 0.3533 |
| 0.0727        | 21.0    | 72723  | 1.0101          | 1.0386 | 0.3510 |
| 0.0654        | 22.0    | 76186  | 1.0316          | 1.0345 | 0.3494 |
| 0.0605        | 23.0    | 79649  | 1.0584          | 1.0254 | 0.3438 |
| 0.0566        | 24.0    | 83112  | 1.0380          | 1.0479 | 0.3431 |
| 0.0507        | 25.0    | 86575  | 1.0691          | 1.0427 | 0.3413 |
| 0.0498        | 26.0    | 90038  | 1.1261          | 1.0399 | 0.3407 |
| 0.0444        | 27.0    | 93501  | 1.1671          | 1.0578 | 0.3417 |
| 0.0444        | 28.0    | 96964  | 1.1998          | 1.0621 | 0.3414 |
| 0.0439        | 29.0    | 100427 | 1.1988          | 1.0568 | 0.3406 |
| 0.0441        | 29.9915 | 103860 | 1.2041          | 1.0594 | 0.3410 |

### Framework versions

- Transformers 4.47.0.dev0
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3
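
## How to use

A minimal inference sketch, assuming this checkpoint was trained with a CTC head and ships the usual Wav2Vec2-style processor files (the standard setup for `generated_from_trainer` HuBERT ASR fine-tunes); the model path and audio filename below are placeholders:

```python
import torch
import librosa
from transformers import AutoModelForCTC, AutoProcessor

# Placeholder: point this at the checkpoint directory or its Hub repo id.
model_id = "./Hubert-noisy-cv-kakeiken"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)
model.eval()

# HuBERT-base expects 16 kHz mono input; librosa resamples on load.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax over the vocabulary at each frame;
# batch_decode collapses repeats and removes blank tokens.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```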
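
## A note on WER for Japanese

Japanese is written without spaces, so whitespace-based WER treats an entire utterance as a single "word": any character error makes the whole utterance count as wrong, pushing WER to 1.0 or above (insertions can exceed 1.0, as in the table). CER is therefore the more informative metric here. A sketch with the `evaluate` library, using a made-up reference/prediction pair:

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical pair: one wrong character in an unsegmented Japanese sentence.
references = ["今日はいい天気です"]
predictions = ["今日はいい天気だす"]

# With no spaces, the whole sentence is one "word", so the single character
# error yields WER = 1.0 while CER is only 1/9 ≈ 0.11.
print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```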
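
## Reproducing the training configuration

The hyperparameters listed above map onto `transformers.TrainingArguments` roughly as sketched below. This reconstructs only the reported settings; the dataset loading, preprocessing, and `Trainer` wiring of the original run are omitted, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-noisy-cv-kakeiken",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # total train batch size: 16 * 2 = 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12_500,
    num_train_epochs=30.0,
    fp16=True,               # "Native AMP" mixed precision
    eval_strategy="epoch",   # per-epoch validation, as in the results table
)
```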