Speed benchmark #3
opened by pritam
I would like to know whether this model is actually faster than the original model. Could you add some relevant benchmarks to the README?
Large-v3 models on GPU
| Implementation | Precision | Beam size | Time | Max. GPU memory | Max. CPU memory | WER (%) |
|---|---|---|---|---|---|---|
| openai/whisper-large-v3 | fp16 | 5 | 2m23s | MB | MB | |
| openai/whisper-turbo | fp16 | 5 | 39s | MB | MB | |
| faster-whisper | fp16 | 5 | 52.023s | 4521 MB | 901 MB | 2.883 |
| faster-whisper | int8 | 5 | 52.639s | 2953 MB | 2261 MB | 4.594 |
| faster-distil-large-v3 | fp16 | 5 | 26.126s | 2409 MB | 900 MB | 2.392 |
| faster-distil-large-v3 | int8 | 5 | 22.537s | 1481 MB | 1468 MB | 2.392 |
| faster-large-v3-turbo | fp16 | 5 | 19.155s | 2537 MB | 899 MB | 1.919 |
| faster-large-v3-turbo | int8 | 5 | 19.591s | 1545 MB | 1526 MB | 1.919 |
WER measured on the LibriSpeech clean validation split.
GPU: GeForce RTX 2080 Ti (11 GB).
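In case it helps with reproduction, here is a minimal sketch of how the timing side of such a run could look with faster-whisper. The model identifier and audio path below are placeholders, not the exact setup used for the table above, and the memory and WER columns would need separate tooling.

```python
import time

from faster_whisper import WhisperModel

# Placeholder model repo and audio file -- substitute your own.
MODEL = "faster-whisper-large-v3-turbo"
AUDIO = "sample.wav"

# compute_type="float16" corresponds to the fp16 rows; use "int8" for the int8 rows.
model = WhisperModel(MODEL, device="cuda", compute_type="float16")

start = time.perf_counter()
segments, info = model.transcribe(AUDIO, beam_size=5)
# transcribe() returns a lazy generator; iterate it to force full decoding,
# otherwise the timing would only cover model loading and setup.
text = "".join(segment.text for segment in segments)
elapsed = time.perf_counter() - start

print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
print(f"Transcription time: {elapsed:.3f}s")
```

Note that consuming the segment generator inside the timed region is what makes the measurement comparable to the wall-clock times in the table.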