metadata
language:
  - th
base_model: openai/whisper-small
datasets:
  - mozilla-foundation/common_voice_11_0
metrics:
  - wer
  - ter
  - chrf
  - cer
  - bleu
  - suber
model-index:
  - name: Whisper Small Thai Lora - Magi Boss
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 11.0
          type: mozilla-foundation/common_voice_11_0
          config: th
          split: None
          args: 'config: th, split: validation'
        metrics:
          - name: Wer
            type: wer
            value: 1.1186
          - name: Ter
            type: ter
            value: 111.8553
          - name: ChrF
            type: chrf
            value: 66.9454
          - name: CER
            type: cer
            value: 0.2283
          - name: BLEU
            type: bleu
            value: 3.6586
          - name: SubER
            type: suber
            value: 1.1628
pipeline_tag: automatic-speech-recognition
license: apache-2.0
library_name: peft

Whisper Small Thai Lora - Magi Boss

This model is a LoRA fine-tune (via PEFT) of openai/whisper-small on the Thai configuration of the Common Voice 11.0 dataset, using 20,000 rows for training and 500 rows for validation. It achieves the following results on the evaluation set:

  • Loss: 0.8313
  • WER: 1.1186
  • TER: 111.8553
  • ChrF: 66.9454
  • CER: 0.2283
  • BLEU: 3.6586
  • SubER: 1.1628
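The WER above appears to be reported as a ratio (1.1186 matches the TER of 111.8553 once scaled by 100), and a value above 1 is possible because inserted words count as errors against the reference length. The sketch below illustrates this with a plain Levenshtein distance. It is not the evaluation code used for this model, and it assumes whitespace tokenization; Thai is written without spaces between words, and the card does not state how the text was segmented, so word-level scores depend heavily on that choice.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions and deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a ratio; assumes whitespace-separated words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a ratio."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: a 2-word reference transcribed as 4 words.
# One substitution plus two insertions over two reference words.
print(wer("a b", "a x y z"))  # 1.5
```

Because CER is computed per character, it is unaffected by word segmentation, which may be why it (0.2283) looks far better than the word-level scores here.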

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 25
  • num_epochs: 1
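As a sanity check on these values, the sketch below derives the number of optimizer steps and the learning rate at a given step under a linear schedule with warmup. It assumes a single device and no gradient accumulation (neither is stated in the card), so the effective batch size equals train_batch_size; the 20,000-row training set size is taken from the description above. The schedule formula is the usual linear warmup followed by linear decay to zero, written out by hand rather than taken from the training script.

```python
import math

# Values from the card; the single-device, no-accumulation setup is an assumption.
TRAIN_ROWS = 20_000
TRAIN_BATCH_SIZE = 32
NUM_EPOCHS = 1
LEARNING_RATE = 1e-5
WARMUP_STEPS = 25

steps_per_epoch = math.ceil(TRAIN_ROWS / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - WARMUP_STEPS)


print(steps_per_epoch)  # 625
for step in (0, 25, 250, 500, 625):
    print(step, step / steps_per_epoch, linear_lr(step))
```

Under these assumptions one epoch is 625 steps, which agrees with the final row of the training results (step 625 at epoch 1), and steps 250 and 500 fall at epochs 0.4 and 0.8.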

Training results

| Training Loss | Epoch | Step | Validation Loss | WER | TER | ChrF | CER | BLEU | SubER |
|---|---|---|---|---|---|---|---|---|---|
| 0.1990 | 0.4 | 250 | 0.8732 | 1.1969 | 119.6879 | 65.8239 | 0.2487 | 4.2583 | 1.2745 |
| 0.1902 | 0.8 | 500 | 0.8353 | 1.1232 | 112.3175 | 66.5794 | 0.2430 | 3.9823 | 1.1698 |
| 0.1873 | 1 | 625 | 0.8313 | 1.1186 | 111.8553 | 66.9454 | 0.2283 | 3.6586 | 1.1628 |

Framework versions

  • PEFT 0.12.1.dev0
  • Transformers 4.45.0.dev0
  • PyTorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.19.1
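
With library versions like those listed above, a PEFT adapter such as this one is typically loaded on top of the base checkpoint and then used for generation. The following is a minimal sketch, not an official usage snippet from the author. `ADAPTER_ID` is a hypothetical placeholder because the card does not state the repository id, and the 16 kHz mono input is an assumption based on what Whisper's feature extractor expects.

```python
BASE_MODEL_ID = "openai/whisper-small"           # from the card's base_model field
ADAPTER_ID = "your-namespace/your-adapter-repo"  # hypothetical placeholder, replace with the real repo id


def load_model(adapter_id=ADAPTER_ID):
    """Load the base Whisper model and attach the LoRA adapter."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(
        BASE_MODEL_ID, language="thai", task="transcribe"
    )
    base_model = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return processor, model


def transcribe(processor, model, audio, sampling_rate=16_000):
    """Transcribe a 1-D float waveform (assumed mono, 16 kHz) to Thai text."""
    import torch

    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        token_ids = model.generate(
            input_features=inputs.input_features,
            language="th",
            task="transcribe",
        )
    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

A typical call would be `processor, model = load_model("<adapter repo id>")` followed by `transcribe(processor, model, waveform)`, where `waveform` is a NumPy array of audio samples.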