Whisper Base Pashto - Augmented

This model is a fine-tuned version of openai/whisper-base on the google/fleurs dataset. It achieves the following results on the evaluation set at the final training step (step 1000):

  • Loss: 0.8723
  • WER: 57.6120
  • CER: 26.6468
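
For readers unfamiliar with the two error metrics reported above, the following is a minimal sketch of how word error rate and character error rate are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length, expressed as a percentage. This card does not state which library or text normalization produced the reported numbers, so the helper names `edit_distance`, `wer`, and `cer` are illustrative assumptions, not the evaluation code actually used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent (assumes a non-empty reference)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent (assumes a non-empty reference)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Toy English strings for illustration only, not drawn from the dataset.
    print(wer("a b c d", "a x c"))
    print(cer("abcd", "abed"))
```

Because both metrics divide by the reference length, a value above 50 means that more than half as many edits as reference words (or characters) were needed to turn the hypothesis into the reference.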

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 30
  • training_steps: 1000
  • mixed_precision_training: Native AMP
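
The relationship between these values can be sketched in a few lines. The snippet below shows how the total batch size of 64 follows from the per-device batch size and gradient accumulation, and how a linear schedule with 30 warmup steps shapes the learning rate over 1000 steps. It assumes a single training device (the card does not state the device count, but 16 × 4 = 64 is consistent with one) and assumes the usual linear warmup followed by linear decay to zero; the function names are illustrative.

```python
LEARNING_RATE = 1e-5
WARMUP_STEPS = 30
TRAINING_STEPS = 1000
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1  # assumption: not stated in the card


def effective_batch_size() -> int:
    """Examples consumed per optimizer update."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at_step(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        remaining = TRAINING_STEPS - step
        scale = max(0.0, remaining / max(1, TRAINING_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * scale


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 15, 30, 515, 1000):
        print(step, lr_at_step(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 30 and falls back to zero at step 1000.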

Training results

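| Training Loss | Epoch | Step | Validation Loss | WER     | CER     |
|---------------|-------|------|-----------------|---------|---------|
| 0.9708        | 2.38  | 100  | 0.8821          | 64.0133 | 27.3253 |
| 0.7477        | 4.75  | 200  | 0.8062          | 59.9576 | 26.4079 |
| 0.6229        | 7.14  | 300  | 0.7855          | 58.3081 | 26.3193 |
| 0.4833        | 9.52  | 400  | 0.7870          | 57.5288 | 24.8855 |
| 0.4084        | 11.89 | 500  | 0.7980          | 56.5224 | 25.2214 |
| 0.3323        | 14.28 | 600  | 0.8201          | 56.6662 | 25.3317 |
| 0.283         | 16.66 | 700  | 0.8406          | 57.7406 | 26.8674 |
| 0.2598        | 19.05 | 800  | 0.8538          | 57.2866 | 26.0386 |
| 0.2235        | 21.42 | 900  | 0.8697          | 58.2703 | 26.6819 |
| 0.2202        | 23.8  | 1000 | 0.8723          | 57.6120 | 26.6468 |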
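
The results above can be scanned programmatically to see which evaluation step scored best on each metric. The sketch below copies the rows from the table and picks the minimum per column; the `EvalRow` type and `best_by` helper are illustrative names, not part of the training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    wer: float
    cer: float


# Values copied from the training results table.
ROWS = [
    EvalRow(100, 0.9708, 0.8821, 64.0133, 27.3253),
    EvalRow(200, 0.7477, 0.8062, 59.9576, 26.4079),
    EvalRow(300, 0.6229, 0.7855, 58.3081, 26.3193),
    EvalRow(400, 0.4833, 0.7870, 57.5288, 24.8855),
    EvalRow(500, 0.4084, 0.7980, 56.5224, 25.2214),
    EvalRow(600, 0.3323, 0.8201, 56.6662, 25.3317),
    EvalRow(700, 0.2830, 0.8406, 57.7406, 26.8674),
    EvalRow(800, 0.2598, 0.8538, 57.2866, 26.0386),
    EvalRow(900, 0.2235, 0.8697, 58.2703, 26.6819),
    EvalRow(1000, 0.2202, 0.8723, 57.6120, 26.6468),
]


def best_by(metric: str) -> EvalRow:
    """Return the row with the lowest value for the given column name."""
    return min(ROWS, key=lambda row: getattr(row, metric))


if __name__ == "__main__":
    for metric in ("val_loss", "wer", "cer"):
        row = best_by(metric)
        print(f"{metric}: best at step {row.step} ({getattr(row, metric)})")
```

Validation loss is lowest at step 300, CER at step 400, and WER at step 500, while training loss keeps falling through step 1000. That pattern is consistent with overfitting in the later steps, so an earlier checkpoint may generalize slightly better than the final one whose metrics are reported at the top of this card.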

Framework versions

  • Transformers 4.26.0.dev0
  • Pytorch 1.13.1+cu116
  • Datasets 2.8.1.dev0
  • Tokenizers 0.13.2
