PhoWhisper-large-ct2
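This repository contains the PhoWhisper-large model converted to the CTranslate2 format for use with faster-whisper. CTranslate2 supports reduced-precision inference (FP16 and INT8), which typically lowers latency and memory use compared with the original Transformers checkpoint, especially on CPU.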
Usage
Installation: Ensure you have the necessary libraries installed:
pip install transformers ctranslate2 faster-whisper
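Conversion (only needed once): This step converts the original Hugging Face model to the CTranslate2 format. It can be skipped if you load the already converted model by its repository name, as the transcription example does.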
ct2-transformers-converter --model vinai/PhoWhisper-large --output_dir PhoWhisper-large-ct2 --copy_files tokenizer_config.json --quantization float16
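The same conversion can also be run from Python. The sketch below assumes CTranslate2's `TransformersConverter` class, which backs the command-line tool, and mirrors the flags used above. The function name `convert_phowhisper` and its defaults are illustrative and not part of this repository.

```python
def convert_phowhisper(
    model_id="vinai/PhoWhisper-large",
    output_dir="PhoWhisper-large-ct2",
    quantization="float16",
    copy_files=("tokenizer_config.json",),
):
    """Convert a Hugging Face Whisper checkpoint to the CTranslate2 format.

    Hypothetical helper: the defaults mirror the CLI command above.
    """
    # Imported lazily so that defining the helper does not require ctranslate2.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_id, copy_files=list(copy_files))
    # Downloads the checkpoint if needed and writes the converted model to output_dir.
    return converter.convert(output_dir, quantization=quantization)


# Example call (downloads the original checkpoint, so it is left commented out):
# convert_phowhisper()
```

Passing a different `quantization` value (for example `"int8"`) would store the weights in that type instead. The `compute_type` chosen at load time can still differ from the stored type.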
Transcription:
    from faster_whisper import WhisperModel

    model_size = "kiendt/PhoWhisper-large-ct2"

    # Run on GPU with FP16
    # model = WhisperModel(model_size, device="cuda", compute_type="float16")

    # or run on GPU with INT8
    # model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")

    # or run on CPU with INT8
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # Replace audio.wav with your audio file
    segments, info = model.transcribe("audio.wav", beam_size=5)

    print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

    for segment in segments:
        print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
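Note that `transcribe` returns `segments` as a generator, so the audio is only decoded as the loop consumes it. As a small sketch of how the results might be collected instead of printed, the helper below formats any iterable of objects with `start`, `end`, and `text` attributes. `format_segments` and the stand-in `FakeSegment` type are illustrative names, not part of faster-whisper.

```python
from collections import namedtuple


def format_segments(segments):
    """Return one "[start -> end] text" line per segment."""
    lines = []
    # Iterating is what drives the transcription when a generator is passed in.
    for segment in segments:
        lines.append(
            "[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text.strip())
        )
    return lines


# Stand-in for faster-whisper's segment objects, used only to demonstrate the helper.
FakeSegment = namedtuple("FakeSegment", ["start", "end", "text"])
demo = format_segments(
    [FakeSegment(0.0, 1.5, " xin chào"), FakeSegment(1.5, 3.25, " Việt Nam")]
)
print("\n".join(demo))
```

With a real model, `format_segments(segments)` can be called on the first value returned by `model.transcribe(...)`. The resulting list can then be written to a file or joined into a single transcript.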
Model Details
- Based on the vinai/PhoWhisper-large model.
- Converted using ct2-transformers-converter.
- Optimized for faster inference with CTranslate2.
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
License
MIT