Malaysian Finetune Whisper Large V3 Turbo

Whisper Large V3 Turbo finetuned on Malaysian context.

Improvements

  1. Distilled from Whisper Large V3 on Malaysian and science contexts.
  2. Better translation for Malay, Manglish, Mandarin, Tamil and science contexts.
  3. Word-level timestamps through a new task, selected with the newly introduced <|transcribeprecise|> token.
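
The new task token can be sketched in code as follows. This is a minimal sketch under stated assumptions, not the authors' documented usage: it assumes the checkpoint loads with the standard `transformers` Whisper classes, that `<|transcribeprecise|>` sits in the decoder prompt where `<|transcribe|>` normally does, and that `ms` is the language token for Malay. The helper names `build_decoder_prompt`, `has_task_token` and `load_model` are illustrative only.

```python
REPO_ID = "malaysia-ai/malaysian-whisper-large-v3-turbo"  # repo name as shown on the model page
PRECISE_TASK = "<|transcribeprecise|>"  # task token named in the list above


def build_decoder_prompt(language: str = "ms", task: str = "transcribeprecise") -> str:
    """Build a Whisper-style decoder prompt.

    Assumption: the new task token replaces <|transcribe|> in the usual
    <|startoftranscript|><|lang|><|task|> layout.
    """
    return f"<|startoftranscript|><|{language}|><|{task}|>"


def has_task_token(tokenizer, token: str = PRECISE_TASK) -> bool:
    """Check that the tokenizer vocabulary actually contains the task token."""
    return token in tokenizer.get_vocab()


def load_model(repo_id: str = REPO_ID):
    """Load processor and model in BF16 (downloads the weights on first call)."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(repo_id)
    model = WhisperForConditionalGeneration.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16
    )
    return processor, model


if __name__ == "__main__":
    processor, model = load_model()
    # Verify the assumption before relying on the new task.
    print("task token present:", has_task_token(processor.tokenizer))
    print("decoder prompt:", build_decoder_prompt())
```

How the prompt is then handed to `model.generate` depends on the installed `transformers` version, so that step is left out here rather than guessed.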

How we finetuned it

We finetuned in 2 phases:

  1. Finetune on mesolitica/Malaysian-STT-Whisper.
  2. Anneal on 5% of mesolitica/Malaysian-STT-Whisper plus 100% of malaysia-ai/STT-Whisper; this phase is still in training.
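
The data mix for the annealing phase can be illustrated with a small sketch. It assumes the 5% is a uniform random sample and that the two sources are simply concatenated and shuffled; the actual sampling strategy is not stated, and `build_annealing_mix` is a hypothetical helper, not part of the training code.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def build_annealing_mix(
    phase1_data: Sequence[T],
    annealing_data: Sequence[T],
    phase1_fraction: float = 0.05,
    seed: int = 0,
) -> List[T]:
    """Combine a random fraction of the phase-1 data with all annealing data.

    Assumption: uniform sampling without replacement, then a global shuffle.
    """
    rng = random.Random(seed)
    sample_size = round(len(phase1_data) * phase1_fraction)
    subset = rng.sample(list(phase1_data), sample_size)
    mix = subset + list(annealing_data)
    rng.shuffle(mix)
    return mix


if __name__ == "__main__":
    # Stand-ins for the two datasets; real rows would be audio/text pairs.
    phase1 = [f"phase1-{i}" for i in range(1000)]
    annealing = [f"anneal-{i}" for i in range(200)]
    mix = build_annealing_mix(phase1, annealing)
    print(len(mix))
```

Keeping a small slice of the phase-1 data in the mix is a common way to limit forgetting while the model adapts to the new set; whether that is the motivation here is not stated.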

Model size: 809M params, tensor type BF16 (Safetensors).