---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- audiofolder
metrics:
- wer
base_model: rinna/japanese-hubert-base
model-index:
- name: hubert-japanese-base-noise-0426
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: audiofolder
      type: audiofolder
      config: default
      split: None
      args: default
    metrics:
    - type: wer
      value: 0.992
      name: Wer
---


# hubert-japanese-base-noise-0426

This model is a fine-tuned version of [rinna/japanese-hubert-base](https://huggingface.co/rinna/japanese-hubert-base) on the audiofolder dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2302
- Cer: 0.0598
- Wer: 0.992

## Model description

This model transcribes Japanese speech into hiragana and was created with the proposed method.  
It is fine-tuned from rinna's HuBERT base model.

## Intended uses & limitations

More information needed

## Training and evaluation data

- Train: noisepaused_JNAS_train_0408
- Test: noisepaused_JNAS_test_0408

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 12500.0
- num_epochs: 25
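
The learning-rate settings above can be illustrated with a small sketch. It assumes linear warmup followed by a single half-cosine decay to zero, which is the usual shape of a `cosine` scheduler with warmup. The exact scheduler implementation used in training is not stated in this card. The step counts come from the training results table (2,500 steps per epoch, 25 epochs), and the function name `lr_at` is hypothetical.

```python
import math

# Values taken from the hyperparameter list and the training results table.
BASE_LR = 3e-05
WARMUP_STEPS = 12_500
STEPS_PER_EPOCH = 2_500
NUM_EPOCHS = 25
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to base_lr, then a half-cosine
    decay from base_lr down to 0 at the final step.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for epoch in (1, 5, 15, 25):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:5d}): lr = {lr_at(step):.2e}")
```

Under these assumptions the warmup spans the first 5 of the 25 epochs (12,500 / 2,500), so the peak learning rate of 3e-05 is reached only at the end of epoch 5.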

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Cer    | Wer   |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-----:|
| 11.9556       | 1.0   | 2500  | 9.5354          | 0.9998 | 1.0   |
| 3.8038        | 2.0   | 5000  | 3.6912          | 0.9998 | 1.0   |
| 1.668         | 3.0   | 7500  | 1.1310          | 0.2733 | 1.0   |
| 0.688         | 4.0   | 10000 | 0.4272          | 0.1880 | 1.0   |
| 0.4959        | 5.0   | 12500 | 0.3254          | 0.1356 | 0.998 |
| 0.4275        | 6.0   | 15000 | 0.2856          | 0.1026 | 1.0   |
| 0.3647        | 7.0   | 17500 | 0.2720          | 0.0884 | 0.998 |
| 0.346         | 8.0   | 20000 | 0.2625          | 0.0848 | 0.998 |
| 0.3273        | 9.0   | 22500 | 0.2646          | 0.0896 | 0.996 |
| 0.301         | 10.0  | 25000 | 0.2479          | 0.0734 | 0.996 |
| 0.2871        | 11.0  | 27500 | 0.2466          | 0.0778 | 0.998 |
| 0.268         | 12.0  | 30000 | 0.2403          | 0.0717 | 0.992 |
| 0.2494        | 13.0  | 32500 | 0.2467          | 0.0705 | 0.994 |
| 0.2336        | 14.0  | 35000 | 0.2411          | 0.0702 | 0.994 |
| 0.2347        | 15.0  | 37500 | 0.2352          | 0.0662 | 0.994 |
| 0.2261        | 16.0  | 40000 | 0.2400          | 0.0708 | 0.996 |
| 0.207         | 17.0  | 42500 | 0.2341          | 0.0652 | 0.996 |
| 0.2018        | 18.0  | 45000 | 0.2340          | 0.0635 | 0.994 |
| 0.196         | 19.0  | 47500 | 0.2323          | 0.0578 | 0.992 |
| 0.1856        | 20.0  | 50000 | 0.2343          | 0.0625 | 0.992 |
| 0.1788        | 21.0  | 52500 | 0.2303          | 0.0597 | 0.992 |
| 0.1821        | 22.0  | 55000 | 0.2285          | 0.0596 | 0.99  |
| 0.1824        | 23.0  | 57500 | 0.2305          | 0.0591 | 0.99  |
| 0.1693        | 24.0  | 60000 | 0.2297          | 0.0598 | 0.99  |
| 0.1807        | 25.0  | 62500 | 0.2302          | 0.0598 | 0.992 |
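
In the table above, Cer falls to about 0.06 while Wer stays close to 1.0. The sketch below shows how the two metrics are commonly computed from an edit distance, and how they can diverge. It rests on an assumption this card does not confirm: that the transcripts are unsegmented hiragana, so splitting on whitespace yields one "word" per utterance. The example strings are hypothetical and are not taken from the evaluation set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character-level edits / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)


def wer(ref, hyp):
    """Word error rate: word-level edits / reference word count.

    Words are obtained by splitting on whitespace (an assumption).
    """
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


# Hypothetical example: one wrong character in an unsegmented utterance.
reference = "きょうはいいてんきです"
hypothesis = "きょうはいいてんきてす"

if __name__ == "__main__":
    print(f"CER = {cer(reference, hypothesis):.4f}")
    print(f"WER = {wer(reference, hypothesis):.4f}")
```

With whitespace tokenization, a single character error makes the whole utterance count as one wrong word, so Cer is likely the more informative metric for a model of this kind.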


### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.2
- Datasets 2.18.0
- Tokenizers 0.15.1