metadata
base_model: openai/whisper-tiny
datasets:
  - fleurs
language:
  - it
license: apache-2.0
metrics:
  - wer
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
model-index:
  - name: Whisper Tiny Italian 5k - Chee Li
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Google Fleurs
          type: fleurs
          config: it_it
          split: None
          args: 'config: it split: test'
        metrics:
          - type: wer
            value: 50.93909245328804
            name: Wer

Whisper Tiny Italian 5k - Chee Li

This model is a fine-tuned version of openai/whisper-tiny on the Italian configuration (it_it) of the Google Fleurs dataset. At the final training step (5000), it achieves the following results on the evaluation set:

  • Loss: 0.6896
  • Wer: 50.9391
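For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate (WER) can be computed for a single reference/hypothesis pair. It assumes the reported value is a percentage (word-level edit distance divided by the number of reference words, times 100), which this card does not state explicitly. The function name `word_error_rate` and the Italian example sentences are illustrative and are not taken from the evaluation data. Any text normalization applied before scoring is unknown and is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one utterance (assumed definition, see lead-in)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Illustrative sentences only: one of five reference words is missing.
print(word_error_rate("il gatto dorme sul divano", "il gatto dorme divano"))
```

A corpus-level score is usually obtained by summing the errors and the reference lengths over all utterances before dividing, rather than by averaging per-utterance values.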

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
  • mixed_precision_training: Native AMP
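To make the learning-rate settings concrete, here is a small sketch of the schedule they imply. It assumes that the `linear` scheduler type warms up linearly from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the last training step. The function name `linear_schedule_lr` is hypothetical, and the default values are the ones listed above.

```python
def linear_schedule_lr(step: int,
                       peak_lr: float = 1e-5,
                       warmup_steps: int = 500,
                       total_steps: int = 5000) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at a few steps, including the warmup boundary.
for step in (0, 250, 500, 2750, 5000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 500 and has already dropped to half of the peak value by step 2750.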

Training results

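| Training Loss | Epoch   | Step | Validation Loss | Wer     |
|---------------|---------|------|-----------------|---------|
| 0.195         | 4.6729  | 1000 | 0.4900          | 53.7054 |
| 0.0248        | 9.3458  | 2000 | 0.5791          | 61.4365 |
| 0.0076        | 14.0187 | 3000 | 0.6469          | 54.1907 |
| 0.0044        | 18.6916 | 4000 | 0.6788          | 51.7641 |
| 0.0036        | 23.3645 | 5000 | 0.6896          | 50.9391 |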
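The table above can be read programmatically. The short sketch below copies its rows and derives two things: the approximate number of optimizer steps per epoch, and which logged checkpoint is best by validation loss versus by WER. The variable names are illustrative, and the steps-per-epoch figure is only an estimate from the rounded epoch values.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the table above
rows = [
    (0.195, 4.6729, 1000, 0.4900, 53.7054),
    (0.0248, 9.3458, 2000, 0.5791, 61.4365),
    (0.0076, 14.0187, 3000, 0.6469, 54.1907),
    (0.0044, 18.6916, 4000, 0.6788, 51.7641),
    (0.0036, 23.3645, 5000, 0.6896, 50.9391),
]

# Estimate optimizer steps per epoch from the first logged row.
steps_per_epoch = round(rows[0][2] / rows[0][1])

# Pick the logged checkpoints with the lowest validation loss and lowest WER.
best_by_loss = min(rows, key=lambda r: r[3])
best_by_wer = min(rows, key=lambda r: r[4])

print("steps per epoch (estimate):", steps_per_epoch)
print("lowest validation loss at step:", best_by_loss[2])
print("lowest WER at step:", best_by_wer[2])
```

The two criteria disagree here: validation loss is lowest at the first logged checkpoint and rises afterwards while training loss keeps falling, whereas WER is lowest at the final step.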

Framework versions

  • Transformers 4.43.4
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1