Difference in Transcription Quality Between Local Whisper Large V2 and Model Card Inference API
#51
by nkanaka1 · opened
I've recently started using OpenAI's Whisper to transcribe audio files, loading the model locally with `whisper.load_model("large-v2")`. Based on the model's reported capabilities, I expected a high level of accuracy.
However, the transcriptions I get locally are significantly worse than those produced by the hosted Inference API widget on the model card, even for the same audio.
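For reference, this is roughly my local setup, a minimal sketch assuming the `openai-whisper` package is installed; `audio.mp3` is a placeholder path, not my actual file:

```python
import whisper

# Load the large-v2 checkpoint locally
# (downloads the weights on first run).
model = whisper.load_model("large-v2")

# Transcribe a local audio file; fp16=False avoids
# the half-precision warning when running on CPU.
result = model.transcribe("audio.mp3", fp16=False)
print(result["text"])
```

I'm not passing any other decoding options, so everything else is at its default value.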