kiendt/PhoWhisper-large-ct2

Automatic Speech Recognition


kiendt committed on Dec 6, 2024

Commit 84081bc · verified · 1 Parent(s): f1f03d0

Update README.md

Files changed (1)

README.md +58 -3


@@ -1,3 +1,58 @@
----
-license: mit
----

+# PhoWhisper-large-ct2
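+This repository contains the PhoWhisper-large model (`vinai/PhoWhisper-large`) converted to the CTranslate2 format for use with faster-whisper. CTranslate2 generally gives faster inference and lower memory use than the original Transformers implementation, and its INT8 quantization helps most when running on CPU.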
+## Usage
+1. **Installation:**
+   Ensure you have the necessary libraries installed:
+   ```bash
+   pip install transformers ctranslate2 faster-whisper
+   ```
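+2. **Conversion (optional, only needed once):**
+   This repository already hosts the converted model, so this step is only needed if you want to reproduce the conversion yourself. It converts the original Hugging Face model to the CTranslate2 format.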
+   ```bash
+   ct2-transformers-converter --model vinai/PhoWhisper-large --output_dir PhoWhisper-large-ct2 --copy_files tokenizer_config.json --quantization float16
+   ```
+3. **Transcription:**
+   ```python
+   from faster_whisper import WhisperModel
+   # Hugging Face repo ID of the converted model (a local directory path also works)
+   model_id = "kiendt/PhoWhisper-large-ct2"
+   # Pick exactly ONE of the following configurations.
+   # Run on GPU with FP16
+   model = WhisperModel(model_id, device="cuda", compute_type="float16")
+   # or run on GPU with INT8
+   # model = WhisperModel(model_id, device="cuda", compute_type="int8_float16")
+   # or run on CPU with INT8
+   # model = WhisperModel(model_id, device="cpu", compute_type="int8")
+   # Replace audio.wav with your audio file
+   segments, info = model.transcribe("audio.wav", beam_size=5)
+   print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
+   # `segments` is a generator: transcription actually runs as you iterate over it
+   for segment in segments:
+       print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
+   ```
+## Model Details
+* Based on the `vinai/PhoWhisper-large` model.
+* Converted using `ct2-transformers-converter`.
+* Optimized for faster inference with CTranslate2.
+## Contributing
+Contributions are welcome! Please open an issue or submit a pull request.
+## License
+MIT
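
As a follow-up to the transcription step in the README above, the loop prints each segment's `start`, `end`, and `text`. A common next step is to write those segments out as subtitles. The sketch below is not part of the commit and makes some assumptions: the `Segment` dataclass is a hypothetical stand-in for the objects yielded by `model.transcribe` (only the three attributes used in the README are relied on), the helper names `srt_timestamp` and `segments_to_srt` are invented for illustration, and the sample text is made up.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment (start, end, text)."""
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # transcribed text for the segment


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable) -> str:
    """Build SRT text from any iterable of objects with start, end, and text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # Subtitle blocks are separated by a blank line
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up sample data; in practice, pass the `segments` generator from the README
    demo = [Segment(0.0, 2.5, " xin chào"), Segment(2.5, 5.0, " tạm biệt")]
    print(segments_to_srt(demo))
```

Because the helper only reads `start`, `end`, and `text`, the `segments` generator from the README should be usable directly, e.g. `open("audio.srt", "w", encoding="utf-8").write(segments_to_srt(segments))`. Note that the generator can be consumed only once, so either print the segments or convert them, or collect them with `list(segments)` first.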