---
library_name: transformers
tags:
- persian
- whisper-base
- whisper
- farsi
- Neura
- NeuraSpeech
license: apache-2.0
language:
- fa
pipeline_tag: automatic-speech-recognition
---
# NeuraSpeech Whisper Base
<p align="center">
<img src="neura_speech_2.png" width=512 height=256 />
</p>
## Model Description
- **Developed by:** Neura company
- **Funded by:** Neura
- **Model type:** Whisper Base
- **Language(s) (NLP):** Persian
## Model Architecture
Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model.
It is pre-trained for automatic speech recognition (ASR) and speech translation.
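As a rough illustration of what the encoder consumes, the sketch below works out the input shape arithmetic from the standard Whisper front-end settings (16 kHz audio, 30-second windows, hop length 160, 80 mel bins, one stride-2 convolution before the Transformer layers). These constants come from the original Whisper configuration and are assumed, not confirmed, to apply unchanged to this fine-tuned checkpoint. The function names are illustrative only.

```python
# Shape arithmetic for the Whisper audio front end.
# All constants are the standard Whisper defaults (an assumption for this checkpoint).
SAMPLE_RATE = 16_000       # samples per second
CHUNK_SECONDS = 30         # length of one input window
HOP_LENGTH = 160           # samples between consecutive spectrogram frames
N_MELS = 80                # mel frequency bins per frame
ENCODER_CONV_STRIDE = 2    # downsampling applied by the encoder's conv stem


def encoder_input_shape():
    """Return (mel bins, frames) of the log-mel spectrogram for one window."""
    n_samples = SAMPLE_RATE * CHUNK_SECONDS
    n_frames = n_samples // HOP_LENGTH
    return (N_MELS, n_frames)


def encoder_positions():
    """Return the number of sequence positions the Transformer encoder sees."""
    _, n_frames = encoder_input_shape()
    return n_frames // ENCODER_CONV_STRIDE


print(encoder_input_shape())  # (80, 3000)
print(encoder_positions())    # 1500
```

The decoder then attends over those encoder positions while generating text tokens one at a time.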
## Uses
Check out the Google Colab demo to run NeuraSpeech ASR on a free-tier Google Colab instance: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12d7zecB94ah7ZHKnDtJF58saLzdkZAj3#scrollTo=oNt032WVkQUa)
Make sure the `transformers`, `torch`, and `librosa` packages are installed. In a notebook, you can listen to the input audio first:
```python
from IPython.display import Audio, display
# the sampling rate is read from the file, so `rate` is only needed for raw arrays
display(Audio('persian_audio.mp3', autoplay=True))
```
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa

# load model and processor
processor = WhisperProcessor.from_pretrained("Neurai/NeuraSpeech_WhisperBase")
model = WhisperForConditionalGeneration.from_pretrained("Neurai/NeuraSpeech_WhisperBase")
# prompt the decoder for Persian transcription instead of relying on language detection
forced_decoder_ids = processor.get_decoder_prompt_ids(language="fa", task="transcribe")

# load the audio as mono, resampled to the 16 kHz rate Whisper expects
sr = 16000
array, _ = librosa.load('persian_audio.mp3', sr=sr)
input_features = processor(array, sampling_rate=sr, return_tensors="pt").input_features

# generate token ids
# (newer transformers releases may prefer language="fa", task="transcribe" here)
predicted_ids = model.generate(input_features, forced_decoder_ids=forced_decoder_ids)
# decode token ids to text, dropping special tokens such as the language and task markers
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)
```
Transcribed text:
```
او خواهان آزاد کردن بردگان بود
```
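The example above handles a single short clip. By default the Whisper feature extractor pads or truncates each input to 30 seconds, so a longer recording has to be split before it is passed to the processor. Below is a minimal sketch, assuming plain non-overlapping windows; `split_into_chunks` is a hypothetical helper name, not part of `transformers` or `librosa`.

```python
def split_into_chunks(samples, sample_rate=16_000, chunk_seconds=30):
    """Split a 1-D sequence of audio samples into consecutive windows.

    Works on Python lists and NumPy arrays alike, since it only uses
    len() and slicing. The last chunk may be shorter than the others.
    """
    chunk_len = sample_rate * chunk_seconds
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


# 70 seconds of silence at 16 kHz is split into 30 s, 30 s and 10 s
audio = [0.0] * (16_000 * 70)
chunks = split_into_chunks(audio)
print([len(c) / 16_000 for c in chunks])  # [30.0, 30.0, 10.0]
```

Each chunk can then be passed to `processor` in place of `array` and the decoded strings joined. Hard cuts can split a word across two windows, so overlapping windows or cutting at silences are common refinements.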
## More Information
https://neura.info
## Model Card Authors
Esmaeil Zahedi, Mohsen Yazdinejad
## Model Card Contact
[email protected]