speech-synth-large-finetune

This model is a fine-tuned version of openai/whisper-large-v3. The training dataset is not specified. At the final checkpoint (step 5000), it achieves the following results on the evaluation set:

  • Loss: 0.4259
  • Wer: 16.8396
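
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate (WER) is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the Wer value above is expressed as a percentage, which the card does not state explicitly, and it uses plain whitespace tokenization with no text normalization. The function name `word_error_rate` is hypothetical and is not taken from the training script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100.

    Assumption: whitespace tokenization and no text normalization. The
    evaluation behind this card may have normalized text differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Under this definition, one dropped word in a six-word reference gives a WER of about 16.67, which is the same order as the figure reported above.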

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
  • mixed_precision_training: Native AMP
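
The relationship between some of these values can be sketched in a few lines of Python. The block below shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and how a linear schedule with warmup would move the learning rate over the 5000 steps. It assumes a single training device (the card does not say how many were used) and the usual linear warmup followed by linear decay to zero. The helper names are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
WARMUP_STEPS = 500
TRAINING_STEPS = 5000
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
NUM_DEVICES = 1  # assumption: not stated in the card


def total_train_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step.

    Assumes linear warmup from 0 to the peak rate, then linear decay
    to 0 at the final training step.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 250, 500, 2750, 5000):
        print(step, linear_schedule_lr(step))
```

With these settings the rate peaks at step 500 and has fallen to half its peak by step 2750.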

Training results

Training Loss  Epoch    Step  Validation Loss  Wer
0.1313         0.7800   250   0.4953           30.7145
0.0531         1.5585   500   0.4647           28.1055
0.0269         2.3370   750   0.4448           19.9526
0.0101         3.1154   1000  0.4392           23.0062
0.0064         3.8955   1250  0.4053           22.2947
0.0057         4.6739   1500  0.4148           19.3003
0.0044         5.4524   1750  0.4028           17.9958
0.0047         6.2309   2000  0.4125           19.0631
0.003          7.0094   2250  0.3979           17.7883
0.0038         7.7894   2500  0.3923           20.5455
0.0            8.5679   2750  0.4077           17.6401
0.0002         9.3463   3000  0.4050           17.3733
0.0009         10.1248  3250  0.4101           17.0471
0.0005         10.9048  3500  0.4227           17.1954
0.0            11.6833  3750  0.4217           17.2250
0.0002         12.4618  4000  0.4241           17.0471
0.0            13.2402  4250  0.4239           16.9582
0.0005         14.0187  4500  0.4250           16.6617
0.0            14.7988  4750  0.4254           16.8396
0.0001         15.5772  5000  0.4259           16.8396
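
The log above can be examined programmatically, for example to find which evaluation step had the lowest WER or validation loss, and to estimate how many optimizer steps make up one epoch. The sketch below copies the Step, Epoch, Validation Loss, and Wer columns verbatim. The variable names are hypothetical, and the steps-per-epoch figure is only an estimate derived from the final row.

```python
# (step, epoch, validation_loss, wer) copied from the training results table.
RESULTS = [
    (250, 0.7800, 0.4953, 30.7145),
    (500, 1.5585, 0.4647, 28.1055),
    (750, 2.3370, 0.4448, 19.9526),
    (1000, 3.1154, 0.4392, 23.0062),
    (1250, 3.8955, 0.4053, 22.2947),
    (1500, 4.6739, 0.4148, 19.3003),
    (1750, 5.4524, 0.4028, 17.9958),
    (2000, 6.2309, 0.4125, 19.0631),
    (2250, 7.0094, 0.3979, 17.7883),
    (2500, 7.7894, 0.3923, 20.5455),
    (2750, 8.5679, 0.4077, 17.6401),
    (3000, 9.3463, 0.4050, 17.3733),
    (3250, 10.1248, 0.4101, 17.0471),
    (3500, 10.9048, 0.4227, 17.1954),
    (3750, 11.6833, 0.4217, 17.2250),
    (4000, 12.4618, 0.4241, 17.0471),
    (4250, 13.2402, 0.4239, 16.9582),
    (4500, 14.0187, 0.4250, 16.6617),
    (4750, 14.7988, 0.4254, 16.8396),
    (5000, 15.5772, 0.4259, 16.8396),
]

# Row with the lowest word error rate.
best_wer = min(RESULTS, key=lambda row: row[3])
# Row with the lowest validation loss.
best_loss = min(RESULTS, key=lambda row: row[2])

# Rough estimate of optimizer steps per epoch, taken from the final row.
final_step, final_epoch, _, _ = RESULTS[-1]
steps_per_epoch = final_step / final_epoch

if __name__ == "__main__":
    print("lowest WER:", best_wer)
    print("lowest validation loss:", best_loss)
    print("approx. steps per epoch:", round(steps_per_epoch, 1))
```

In this log the lowest Wer (16.6617) occurs at step 4500 and the lowest validation loss (0.3923) at step 2500, so the final checkpoint at step 5000 is not the best by either measure, although the differences in the later rows are small.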

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 1.54B
  • Tensor type: F32
  • Format: Safetensors