speech-synth-large-finetune

This model is a fine-tuned version of openai/whisper-large-v3 on an unknown dataset. At the final training step (5000), it achieves the following results on the evaluation set:

Loss: 0.4259
WER: 16.8396

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 5000
mixed_precision_training: Native AMP
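
To make the interplay of these settings concrete, here is a minimal sketch in plain Python of how the effective batch size and the learning rate at a given step follow from the values listed above. It assumes a single training device (the card does not state the device count, but 8 x 2 = 16 matches the reported total_train_batch_size) and the usual linear-warmup, linear-decay formula. The function names are illustrative and not part of any library.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500
TRAINING_STEPS = 5000
NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of examples contributing to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at the last step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 250, 500, 2750, 5000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 500 and has fallen to half of that by step 2750, which is worth keeping in mind when reading the later rows of the results table.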

Training Loss	Epoch	Step	Validation Loss	WER
0.1313	0.7800	250	0.4953	30.7145
0.0531	1.5585	500	0.4647	28.1055
0.0269	2.3370	750	0.4448	19.9526
0.0101	3.1154	1000	0.4392	23.0062
0.0064	3.8955	1250	0.4053	22.2947
0.0057	4.6739	1500	0.4148	19.3003
0.0044	5.4524	1750	0.4028	17.9958
0.0047	6.2309	2000	0.4125	19.0631
0.003	7.0094	2250	0.3979	17.7883
0.0038	7.7894	2500	0.3923	20.5455
0.0	8.5679	2750	0.4077	17.6401
0.0002	9.3463	3000	0.4050	17.3733
0.0009	10.1248	3250	0.4101	17.0471
0.0005	10.9048	3500	0.4227	17.1954
0.0	11.6833	3750	0.4217	17.2250
0.0002	12.4618	4000	0.4241	17.0471
0.0	13.2402	4250	0.4239	16.9582
0.0005	14.0187	4500	0.4250	16.6617
0.0	14.7988	4750	0.4254	16.8396
0.0001	15.5772	5000	0.4259	16.8396
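
The final row is not the best one on either metric, so it can help to scan the table programmatically. The sketch below copies the step, validation loss, and WER columns from the table above and picks the row that minimizes each. The helper name `best_by` is illustrative only, and the card does not say which checkpoint was actually published.

```python
# Rows copied from the results table: (step, validation_loss, wer).
RESULTS = [
    (250, 0.4953, 30.7145),
    (500, 0.4647, 28.1055),
    (750, 0.4448, 19.9526),
    (1000, 0.4392, 23.0062),
    (1250, 0.4053, 22.2947),
    (1500, 0.4148, 19.3003),
    (1750, 0.4028, 17.9958),
    (2000, 0.4125, 19.0631),
    (2250, 0.3979, 17.7883),
    (2500, 0.3923, 20.5455),
    (2750, 0.4077, 17.6401),
    (3000, 0.4050, 17.3733),
    (3250, 0.4101, 17.0471),
    (3500, 0.4227, 17.1954),
    (3750, 0.4217, 17.2250),
    (4000, 0.4241, 17.0471),
    (4250, 0.4239, 16.9582),
    (4500, 0.4250, 16.6617),
    (4750, 0.4254, 16.8396),
    (5000, 0.4259, 16.8396),
]


def best_by(rows, index):
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[index])


best_loss = best_by(RESULTS, 1)  # lowest validation loss
best_wer = best_by(RESULTS, 2)   # lowest word error rate

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("lowest WER:", best_wer)
```

On these numbers the validation loss bottoms out at step 2500 (0.3923) while WER keeps improving until step 4500 (16.6617). Together with the training loss approaching zero, this suggests the later steps overfit the loss while still helping the decoded transcripts slightly.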