Difference in Transcription Quality Between Local Whisper Large V2 and Model Card Inference API
#51
by nkanaka1 · opened
I've recently started using OpenAI's Whisper to transcribe audio files, loading the model locally with `whisper.load_model("large-v2")`. Based on the model's reported capabilities, I expected a high level of accuracy.
However, the transcriptions I get locally are significantly worse than those produced by the hosted Inference API widget on the model card, even for the same audio.
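For reference, this is roughly my local setup, a minimal sketch assuming the `openai-whisper` package is installed; `audio.mp3` is a placeholder path, not my actual file:

```python
import whisper

# Load the large-v2 checkpoint locally
# (downloads the weights on first run).
model = whisper.load_model("large-v2")

# Transcribe a local audio file; fp16=False avoids
# the half-precision warning when running on CPU.
result = model.transcribe("audio.mp3", fp16=False)
print(result["text"])
```

I'm not passing any other decoding options, so everything else is at its default value.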