cemsubakan committed on
Commit
97464b0
·
1 Parent(s): c0d5002

Create README.md

  1. README.md +83 -0
README.md ADDED
@@ -0,0 +1,83 @@
+ ---
+ language: "en"
+ thumbnail:
+ tags:
+ - Source Separation
+ - Speech Separation
+ - Audio Source Separation
+ - WSJ0-3Mix
+ - SepFormer
+ - Transformer
+ license: "apache-2.0"
+ datasets:
+ - WSJ0-3Mix
+ metrics:
+ - SI-SNRi
+ - SDRi
+
+ ---
+
+ # SepFormer trained on WSJ0-3Mix
+
+ This repository provides the tools needed to perform audio source separation with a [SepFormer](https://arxiv.org/abs/2010.13154v2)
+ model, implemented with SpeechBrain and pretrained on the WSJ0-3Mix dataset. For a better experience, we encourage you to learn more about
+ [SpeechBrain](https://speechbrain.github.io). The model achieves 19.8 dB SI-SNRi on the WSJ0-3Mix test set.
+
+ | Release | Test-Set SI-SNRi | Test-Set SDRi |
+ |:-------------:|:--------------:|:--------------:|
+ | 09-03-21 | 19.8 dB | 20.0 dB |
+
+
+ ## Install SpeechBrain
+
+ First, install SpeechBrain with the following command:
+
+ ```
+ pip install speechbrain
+ ```
+
+ We also encourage you to read our tutorials and learn more about
+ [SpeechBrain](https://speechbrain.github.io).
+
+ ### Perform source separation on your own audio file
+
+ ```python
+
+ from speechbrain.pretrained import SepformerSeparation as separator
+ import torchaudio
+
+ model = separator.from_hparams(source="speechbrain/sepformer-wsj03mix")
+ # Load the mixture as a [channels, time] tensor.
+ mix, fs = torchaudio.load("yourspeechbrainpath/samples/audio_samples/test_mixture.wav")
+ # Separate (output shape [batch, time, n_sources]), then peak-normalize each source.
+ est_sources = model.separate_batch(mix)
+ est_sources = est_sources / est_sources.abs().max(dim=1, keepdim=True)[0]
+ # Save each estimated source; WSJ0-3Mix audio is sampled at 8 kHz.
+ torchaudio.save("source1hat.wav", est_sources[:, :, 0].detach().cpu(), 8000)
+ torchaudio.save("source2hat.wav", est_sources[:, :, 1].detach().cpu(), 8000)
+ torchaudio.save("source3hat.wav", est_sources[:, :, 2].detach().cpu(), 8000)
+
+ ```
+
+ #### Referencing SpeechBrain
+
+ ```
+ @misc{SB2021,
+ author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Chou, Ju-Chieh and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
+ title = {SpeechBrain},
+ year = {2021},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/speechbrain/speechbrain}},
+ }
+ ```
+
+ #### Referencing SepFormer
+ ```
+ @inproceedings{subakan2021attention,
+ title={Attention is All You Need in Speech Separation},
+ author={Cem Subakan and Mirco Ravanelli and Samuele Cornell and Mirko Bronzi and Jianyuan Zhong},
+ year={2021},
+ booktitle={ICASSP 2021}
+ }
+ ```
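
The README reports SI-SNRi without defining it. As a rough illustration, the sketch below uses the commonly used definition of scale-invariant SNR: both signals are made zero-mean, the estimate is projected onto the reference, and the energy of that projection is compared with the energy of the residual. The improvement is then taken relative to the unprocessed mixture. The names `si_snr` and `si_snr_improvement` are hypothetical helpers written for this note, not part of SpeechBrain, and the evaluation code behind the reported 19.8 dB may differ in details.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sequences of floats.

    Hypothetical helper for illustration; not a SpeechBrain function.
    """
    n = len(reference)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]
    # Project the estimate onto the reference to obtain the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after removing the target component counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps
    return 10 * math.log10(target_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNR of the estimate minus SI-SNR of the unprocessed mixture, in dB."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```

For a separated source, `si_snr_improvement(estimate, mixture, reference)` is positive when the estimate is closer to the clean reference than the original mixture was. Rescaling the estimate by a constant leaves the score essentially unchanged, which is why the peak normalization in the usage example does not affect this metric.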