whisper-large-v3-ft-cy-en

This model is a version of openai/whisper-large-v3 fine-tuned on a curated collection of Welsh and English speech data (see techiaith/commonvoice_18_0_cy_en), originally collected by Mozilla's Common Voice project.

It achieves the following word error rates (WER) on the language-specific and combined test sets:

WER (test_en): 13.85
WER (test_cy): 8.78
WER (test_cy+test_en): 9.55
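
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the figures above are percentages and uses plain whitespace tokenization. The card does not say which text normalization or evaluation library was used, so the function name `word_error_rate` and the example sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage (word-level edit distance / reference length)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical sentences, not taken from the test sets.
print(word_error_rate("a b c d", "a x c d"))
```

Because the denominator is the reference length, a combined-set WER is weighted by how many reference words each language contributes, not a simple mean of the two per-language scores.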

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 4000
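
The relationship between these values can be sketched in a few lines. The snippet assumes a single training device (the card does not state the device count, but 4 × 8 = 32 matches the listed total) and the usual shape of a linear schedule with warmup: a ramp from zero to the peak rate over the warmup steps, then a linear decay to zero at the final step. The helper names are hypothetical and this is not the training script itself.

```python
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 500
TRAINING_STEPS = 4000


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to one optimizer update (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay linearly from the peak to 0 at the last training step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
for step in (0, 250, 500, 2250, 4000):
    print(step, linear_lr(step))
```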

Training results

Training Loss	Epoch	Step	Validation Loss	WER
0.2097	0.2497	1000	0.2169	14.2221
0.1621	0.4993	2000	0.1816	11.6845
0.1406	0.7490	3000	0.1609	10.2445
0.1242	0.9987	4000	0.1505	9.5594
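
Two quantities can be read off this table with simple arithmetic, sketched below. The first is a rough estimate of the training set size, assuming each step consumes 32 examples (the total train batch size) and that the Epoch column is the fraction of the training set seen so far. The second is the relative WER reduction between the first and last checkpoints. Both helper names are hypothetical, and the size estimate is approximate because the Epoch values are rounded.

```python
TOTAL_TRAIN_BATCH_SIZE = 32

# (step, epoch, validation WER) copied from the table above.
CHECKPOINTS = [
    (1000, 0.2497, 14.2221),
    (2000, 0.4993, 11.6845),
    (3000, 0.7490, 10.2445),
    (4000, 0.9987, 9.5594),
]


def estimated_train_examples(step: int, epoch: float,
                             batch_size: int = TOTAL_TRAIN_BATCH_SIZE) -> float:
    """Estimate the dataset size from examples seen so far / fraction of an epoch."""
    return step * batch_size / epoch


def relative_wer_reduction(first_wer: float, last_wer: float) -> float:
    """Percentage drop in WER relative to the first value."""
    return 100.0 * (first_wer - last_wer) / first_wer


last_step, last_epoch, last_wer = CHECKPOINTS[-1]
first_wer = CHECKPOINTS[0][2]
print(round(estimated_train_examples(last_step, last_epoch)))
print(round(relative_wer_reduction(first_wer, last_wer), 1))
```

Under these assumptions, the 4000 training steps amount to just under one pass over roughly 128,000 training examples.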

Framework versions

Transformers 4.41.2
Pytorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Model size: 1.54B params (Safetensors, tensor type F32)