Brouhaha
Joint voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation
TL;DR | Paper | Code | And Now for Something Completely Different
Installation
This model relies on pyannote.audio and brouhaha-vad.
```bash
pip install pyannote-audio
pip install https://github.com/marianne-m/brouhaha-vad/archive/main.zip
```
Usage
```python
# 1. visit hf.co/pyannote/brouhaha and accept user conditions
# 2. visit hf.co/settings/tokens to create an access token
# 3. instantiate pretrained model
from pyannote.audio import Model
model = Model.from_pretrained("pyannote/brouhaha",
                              use_auth_token="ACCESS_TOKEN_GOES_HERE")

# apply model
from pyannote.audio import Inference
inference = Inference(model)
output = inference("audio.wav")

# iterate over each frame
for frame, (vad, snr, c50) in output:
    t = frame.middle
    print(f"{t:8.3f} vad={100*vad:.0f}% snr={snr:.0f} c50={c50:.0f}")

# ...
# 12.952 vad=100% snr=51 c50=17
# 12.968 vad=100% snr=52 c50=17
# 12.985 vad=100% snr=53 c50=17
# ...
```
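The frame-level scores printed above are often easier to use once aggregated. The following is a minimal sketch, not part of the brouhaha or pyannote.audio API: the helper names `summarize_speech` and `speech_segments` and the 0.5 VAD threshold are assumptions chosen for illustration. It works on plain `(time, vad, snr, c50)` tuples, so it runs without the model.

```python
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple

# One frame: (time in seconds, vad score in [0, 1], snr, c50)
Frame = Tuple[float, float, float, float]


def summarize_speech(frames: Iterable[Frame],
                     vad_threshold: float = 0.5) -> Optional[Dict[str, float]]:
    """Average SNR and C50 over frames whose VAD score exceeds the threshold.

    The 0.5 default is an illustrative assumption, not a recommended value.
    """
    speech = [(snr, c50) for _, vad, snr, c50 in frames if vad > vad_threshold]
    if not speech:
        return None  # no frame was classified as speech
    return {
        "speech_frames": len(speech),
        "mean_snr": mean(snr for snr, _ in speech),
        "mean_c50": mean(c50 for _, c50 in speech),
    }


def speech_segments(frames: Iterable[Frame],
                    vad_threshold: float = 0.5) -> List[Tuple[float, float]]:
    """Merge consecutive speech frames into (start, end) spans.

    Span boundaries are the timestamps of the first and last speech frame
    in each run, so a single isolated frame yields start == end.
    """
    segments: List[Tuple[float, float]] = []
    start = previous = None
    for t, vad, _, _ in frames:
        if vad > vad_threshold:
            if start is None:
                start = t  # a new run of speech frames begins
            previous = t
        elif start is not None:
            segments.append((start, previous))  # the run just ended
            start = None
    if start is not None:
        segments.append((start, previous))  # close a run reaching the end
    return segments


# Synthetic frames for demonstration only (not real model output).
demo = [
    (0.0, 0.10, 5.0, 10.0),
    (0.1, 0.90, 50.0, 16.0),
    (0.2, 0.80, 52.0, 18.0),
    (0.3, 0.20, 7.0, 9.0),
    (0.4, 0.95, 54.0, 20.0),
]
print(summarize_speech(demo))
print(speech_segments(demo))
```

With the `output` object from the usage snippet, the input list could be built as `[(frame.middle, vad, snr, c50) for frame, (vad, snr, c50) in output]`. Restricting the averages to speech frames is a design choice: SNR and C50 estimates on non-speech frames may be less meaningful, though the right threshold depends on the recording.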
Citation
```bibtex
@article{lavechin2022brouhaha,
  Title   = {{Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation}},
  Author  = {Marvin Lavechin and Marianne Métais and Hadrien Titeux and Alodie Boissonnet and Jade Copet and Morgane Rivière and Elika Bergelson and Alejandrina Cristia and Emmanuel Dupoux and Hervé Bredin},
  Year    = {2022},
  Journal = {arXiv preprint arXiv:2210.13248}
}
```

```bibtex
@inproceedings{Bredin2020,
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},
  Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
  Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
  Address = {Barcelona, Spain},
  Month = {May},
  Year = {2020},
}
```