This is a pre-trained Fast FullSubNet model, a real-time speech denoising model trained on the INTERSPEECH 2020 Deep Noise Suppression (DNS) Challenge dataset.

How to run

https://fullsubnet.readthedocs.io/en/latest/usage/getting_started.html

Code

https://github.com/Audio-WestlakeU/FullSubNet

Note: The code doesn't support real-time streaming out of the box. See issue-67 for details.
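
For orientation, an offline (non-streaming) enhancement loop in PyTorch might look like the minimal sketch below. This is schematic only: the actual model class, checkpoint layout, and command-line entry point are those documented in the getting-started guide and repository linked above, and the placeholder `model` here is assumed to map a magnitude spectrogram to a time-frequency mask.

```python
import torch
import torchaudio

def enhance_file(model, wav_in, wav_out, n_fft=512, hop=256):
    """Schematic offline enhancement; the real entry point is the repo's inference script."""
    noisy, sr = torchaudio.load(wav_in)                        # [channels, samples]
    window = torch.hann_window(n_fft)
    spec = torch.stft(noisy, n_fft, hop_length=hop,
                      window=window, return_complex=True)      # [C, F, frames]

    with torch.no_grad():
        # Assumption: `model` maps magnitudes to a [C, F, frames] mask;
        # the released model's exact input/output contract is defined in the repo.
        mask = model(spec.abs())
        enhanced_spec = spec * mask

    enhanced = torch.istft(enhanced_spec, n_fft, hop_length=hop,
                           window=window, length=noisy.shape[-1])
    torchaudio.save(wav_out, enhanced, sr)
```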

Paper

Fast FullSubNet: Accelerate Full-band and Sub-band Fusion Model for Single-channel Speech Enhancement, Xiang Hao, Xiaofei Li

For many speech enhancement applications, a key requirement is that the system runs on a real-time, latency-sensitive, battery-powered platform, which strictly limits the algorithm's latency and computational complexity. In this work, we propose a new architecture named Fast FullSubNet, dedicated to accelerating the computation of FullSubNet. Specifically, Fast FullSubNet processes sub-band speech spectra in the mel-frequency domain, using cascaded linear-to-mel full-band, sub-band, and mel-to-linear full-band models, so that the number of frequencies involved in the sub-band computation is vastly reduced. In addition, a down-sampling operation is applied to the sub-band input sequence to further reduce the computational complexity along the time axis. Experimental results show that, compared to FullSubNet, Fast FullSubNet requires only 13% of the computational complexity and 16% of the processing time, while achieving comparable or even better performance.
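
The order of operations described in the abstract can be summarized as a shape-level sketch. Everything below (layer types, hidden sizes, `num_mels=64`, down-sampling factor 2) is an illustrative assumption rather than the paper's configuration; the point is only the cascade: full-band linear-to-mel, sub-band processing on the time-down-sampled mel spectrum, then full-band mel-to-linear.

```python
import torch
import torch.nn as nn

class FastFullSubNetSketch(nn.Module):
    """Shape-level sketch of the cascade; layer choices and sizes are illustrative."""

    def __init__(self, num_freqs=257, num_mels=64, hidden=384, down=2):
        super().__init__()
        self.down = down
        # Full-band stage 1: linear-frequency magnitudes -> mel-domain features.
        self.fb_in = nn.GRU(num_freqs, hidden, batch_first=True)
        self.to_mel = nn.Linear(hidden, num_mels)
        # Sub-band stage: one small model shared across mel bands
        # (fed with just the band itself here, for brevity).
        self.sub = nn.GRU(1, 32, batch_first=True)
        self.sub_out = nn.Linear(32, 1)
        # Full-band stage 2: mel-domain output -> linear-frequency mask.
        self.fb_out = nn.GRU(num_mels, hidden, batch_first=True)
        self.to_linear = nn.Linear(hidden, num_freqs)

    def forward(self, mag):                          # mag: [B, T, F] linear magnitudes
        B, T, F = mag.shape
        mel, _ = self.fb_in(mag)
        mel = self.to_mel(mel)                       # [B, T, M] mel-domain features

        # Down-sample along time before the (most expensive) sub-band stage.
        mel_ds = mel[:, ::self.down, :]              # [B, T', M]
        M = mel_ds.shape[2]

        # Run the shared sub-band model independently per mel band.
        x = mel_ds.permute(0, 2, 1).reshape(B * M, -1, 1)       # [B*M, T', 1]
        y, _ = self.sub(x)
        y = self.sub_out(y).reshape(B, M, -1).permute(0, 2, 1)  # [B, T', M]

        # Up-sample back to the original frame rate (nearest-neighbour here).
        y = y.repeat_interleave(self.down, dim=1)[:, :T, :]

        out, _ = self.fb_out(y)
        return torch.sigmoid(self.to_linear(out))    # [B, T, F] mask in the linear domain
```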

Performance

With Reverb

| Method | WB-PESQ | NB-PESQ | SI-SDR (dB) | STOI |
| --- | --- | --- | --- | --- |
| Fast FullSubNet (118 epochs) | 2.882 | 3.42 | 15.33 | 0.9233 |
| FullSubNet (58 epochs, for comparison) | 2.987 | 3.496 | 15.756 | 0.926 |

No Reverb

| Method | WB-PESQ | NB-PESQ | SI-SDR (dB) |
| --- | --- | --- | --- |
| Fast FullSubNet (118 epochs) | 2.694 | 3.222 | 16.34 |
| FullSubNet (58 epochs, for comparison) | 2.889 | 3.385 | 17.635 |
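
For reference, SI-SDR above is the standard scale-invariant signal-to-distortion ratio in dB. A minimal implementation is sketched below; the repository's evaluation code may differ in detail.

```python
import torch

def si_sdr(estimate: torch.Tensor, reference: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale-invariant SDR (dB) for two 1-D signals of equal length."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = torch.dot(estimate, reference) / (torch.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10 * torch.log10((target.pow(2).sum() + eps) / (noise.pow(2).sum() + eps))
```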