---
license: apache-2.0
tags:
- generated_from_trainer
- dutch
- whisper-event
metrics:
- wer
base_model: qmeeus/whisper-small-nl
model-index:
- name: whisper-small-nl
  results: []
---

# whisper-small-nl

This model is a fine-tuned version of [qmeeus/whisper-small-nl](https://huggingface.co/qmeeus/whisper-small-nl) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3034
- WER: 14.5354

## Model description

More information needed

## Intended uses & limitations

The model transcribes Dutch speech. The example below splits a 16 kHz recording into 30-second chunks and transcribes each chunk in turn:

```python
import math

import soundfile as sf
from transformers import pipeline

# Load the ASR pipeline on the first GPU (device=0).
whisper_asr = pipeline("automatic-speech-recognition", model="qmeeus/whisper-small-nl", device=0)
# Force Dutch transcription instead of language detection or translation.
whisper_asr.model.config.forced_decoder_ids = whisper_asr.tokenizer.get_decoder_prompt_ids(
    task="transcribe", language="nl"
)

# Placeholder path: replace with your own 16 kHz, mono audio file.
filename = "audio.wav"
waveform, sr = sf.read(filename)

def iter_chunks(waveform, sampling_rate=16_000, chunk_length=30.):
    """Yield consecutive chunks of at most `chunk_length` seconds."""
    assert sampling_rate == 16_000
    n_frames = math.floor(sampling_rate * chunk_length)
    for start in range(0, len(waveform), n_frames):
        end = min(len(waveform), start + n_frames)
        yield waveform[start:end]

for sentence in whisper_asr(iter_chunks(waveform, sr), max_new_tokens=448):
    print(sentence["text"])
```

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 10000
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER     |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|
| 0.2045        | 2.49  | 1000  | 0.3194          | 16.1628 |
| 0.0652        | 4.97  | 2000  | 0.3425          | 16.3672 |
| 0.0167        | 7.46  | 3000  | 0.3915          | 15.8187 |
| 0.0064        | 9.95  | 4000  | 0.4190          | 15.7298 |
| 0.1966        | 2.02  | 5000  | 0.3298          | 15.0881 |
| 0.1912        | 4.04  | 6000  | 0.3266          | 14.8764 |
| 0.1008        | 7.02  | 7000  | 0.3261          | 14.8086 |
| 0.0899        | 9.04  | 8000  | 0.3196          | 14.6487 |
| 0.1126        | 12.02 | 9000  | 0.3283          | 14.5894 |
| 0.1071        | 14.04 | 10000 | 0.3034          | 14.5354 |

### Framework versions

- Transformers 4.26.0.dev0
- PyTorch 1.13.0+cu117
- Datasets 2.7.1.dev0
- Tokenizers 0.13.2
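
As a rough illustration of the training hyperparameters listed above, the sketch below shows how a per-device batch of 8 with 16 gradient accumulation steps gives the reported total batch size of 128, and how a linear schedule with 500 warmup steps would vary the learning rate over 10000 steps. The function name `linear_warmup_lr` is hypothetical, the single-device setup is an assumption, and the formula is the usual linear warmup followed by linear decay to zero; it is not taken from the actual training script.

```python
def linear_warmup_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=10_000):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to zero.

    Hypothetical helper for illustration; defaults mirror the values on this card.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * (step / max(1, warmup_steps))
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Effective batch size, assuming a single device (not stated on the card).
per_device_batch_size = 8
gradient_accumulation_steps = 16
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for step in (0, 250, 500, 5_000, 10_000):
        print(f"step {step:>6}: lr = {linear_warmup_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at step 500 and reaches zero at step 10000, the final step in the results table.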