Distil-Whisper-Large-v3 for Brazilian Portuguese

This model is a fine-tuned version of distil-whisper-large-v3 for automatic speech recognition (ASR) in Brazilian Portuguese. It was trained using the Common Voice 16 dataset in conjunction with a private dataset transcribed using Whisper Large v3.

Model Description

The model transcribes Brazilian Portuguese speech to text. Fine-tuned on Common Voice 16 combined with an automatically transcribed private dataset, it achieved a Word Error Rate (WER) of 8.93% on the Common Voice 16 validation set.

  • Model type: Speech recognition model based on distil-whisper-large-v3
  • Language(s) (NLP): Brazilian Portuguese (pt-BR)
  • License: MIT
  • Fine-tuned from model: distil-whisper/distil-large-v3
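
For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name word_error_rate is hypothetical, and the model card does not say which tool or text normalization produced the 8.93% figure, so this illustrates the metric only and is not the author's evaluation script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; no text normalization is applied,
    so casing, punctuation and accents all count as differences.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of five reference words
print(f"{word_error_rate('o gato está em casa', 'o gato esta em casa'):.2%}")  # prints 20.00%
```

Corpus-level WER is usually computed by summing the errors over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance scores.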

How to Get Started with the Model

You can use the model with the Transformers library:

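from datasets import Audio, load_dataset
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the validation split of the Common Voice dataset for Portuguese
common_voice = load_dataset("mozilla-foundation/common_voice_11_0", "pt", split="validation")
# Whisper's feature extractor expects 16 kHz audio, and Common Voice clips are stored at a higher rate, so resample on load
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))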

# Load the pretrained model and processor
processor = WhisperProcessor.from_pretrained("freds0/distil-whisper-large-v3-ptbr")
model = WhisperForConditionalGeneration.from_pretrained("freds0/distil-whisper-large-v3-ptbr")

# Select a sample from the dataset
sample = common_voice[0]  # You can change the index to select a different sample

# Get the audio array and sampling rate
audio_input = sample["audio"]["array"]
sampling_rate = sample["audio"]["sampling_rate"]

# Preprocess the audio
input_features = processor(audio_input, sampling_rate=sampling_rate, return_tensors="pt").input_features

# Generate transcription
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print("Transcription:", transcription[0])

Model size: 756M params (Safetensors, tensor type F32)