---
license: cc-by-4.0
language:
- mk
base_model:
- openai/whisper-large-v3
---

# Fine-tuned whisper-large-v3 model for speech recognition in Macedonian

Authors:

1. Dejan Porjazovski
2. Ilina Jakimovska
3. Ordan Chukaliev
4. Nikola Stikov

This collaboration is part of the activities of the Center for Advanced Interdisciplinary Research (CAIR) at UKIM.

## Data used for training

We trained the model on the following data sources:

1. The Digital Archive for Ethnological and Anthropological Resources (DAEAR) at the Institute of Ethnology and Anthropology, PMF, UKIM.
2. The audio version of the international journal "EthnoAnthropoZoom" at the Institute of Ethnology and Anthropology, PMF, UKIM.
3. The podcast "Обични луѓе" by Ilina Jakimovska.
4. Scientific videos from the series "Наука за деца" by the KANTAROT foundation.
5. The Macedonian subset of Mozilla Common Voice (version 18).

## Usage

```python
import torch
from speechbrain.inference.interfaces import foreign_class

# Run on the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the model together with the custom interface file from the model repository.
asr_classifier = foreign_class(
    source="Macedonian-ASR/whisper-large-v3-macedonian-asr",
    pymodule_file="custom_interface.py",
    classname="ASR",
)
asr_classifier = asr_classifier.to(device)

# Transcribe a single audio file.
predictions = asr_classifier.classify_file("audio_file.wav", device)
print(predictions)
```
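
To transcribe several recordings, the single-file call above can be wrapped in a small loop. The following is a minimal sketch, not part of the model's interface: `transcribe_folder` and its `transcribe` parameter are hypothetical names introduced here, and the sketch assumes that each recording is a `.wav` file and that the transcription call takes one file path at a time.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def transcribe_folder(folder: str, transcribe: Callable[[str], Any]) -> Dict[str, Any]:
    """Apply `transcribe` to every .wav file in `folder` (hypothetical helper).

    Files are processed in sorted order; results are keyed by file name.
    """
    results: Dict[str, Any] = {}
    for wav_path in sorted(Path(folder).glob("*.wav")):
        # `transcribe` is any callable that maps a file path to a transcription.
        results[wav_path.name] = transcribe(str(wav_path))
    return results
```

With the model loaded as shown earlier, a call might look like `transcribe_folder("recordings", lambda path: asr_classifier.classify_file(path, device))`, where `"recordings"` is a placeholder directory name.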