---
datasets:
- AMI
language:
- en
license: apache-2.0
metrics:
-
  name: "IHM test WER"
  type: wer
  value: 17.40
-
  name: "SDM test WER"
  type: wer
  value: 32.21
-
  name: "GSS test WER"
  type: wer
  value: 22.43
tags:
- k2
- icefall
---

# AMI

This is an ASR recipe for the AMI corpus. AMI provides recordings from each speaker's
headset and lapel microphones, as well as from 2 microphone arrays with 8 channels each.
We pool data from the following 4 conditions and train a single model on the pooled data:

(i) Individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) Single distant microphone (SDM)
(iv) Array microphones enhanced with guided source separation (GSS)

Speed perturbation and MUSAN noise augmentation are additionally applied to the pooled
data.
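
The pooling step described above can be pictured with a small, library-free sketch. It is not the recipe's actual data pipeline: the `Utterance` class, the condition keys (`ihm`, `ihm_rvb`, `sdm`, `gss`), the placeholder utterance IDs, and the speed factors 0.9/1.0/1.1 are all assumptions made for illustration, and MUSAN noise mixing is left out entirely.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Utterance:
    """Hypothetical stand-in for one training utterance."""
    utt_id: str
    condition: str      # which of the four pooled conditions it came from
    speed: float = 1.0  # speed-perturbation factor applied to the audio


def pool(sources: Dict[str, List[str]]) -> List[Utterance]:
    """Concatenate all conditions into one training list, tagging each utterance."""
    pooled = []
    for condition, utt_ids in sources.items():
        for utt_id in utt_ids:
            pooled.append(Utterance(f"{utt_id}-{condition}", condition))
    return pooled


def speed_perturb(
    utts: Sequence[Utterance],
    factors: Sequence[float] = (0.9, 1.0, 1.1),  # assumed values, not from the recipe
) -> List[Utterance]:
    """Return one copy of every utterance per speed factor."""
    return [
        replace(u, utt_id=f"{u.utt_id}-sp{f}", speed=f)
        for u in utts
        for f in factors
    ]


# Placeholder IDs: the same two utterances are seen under each condition.
sources = {
    "ihm": ["utt1", "utt2"],
    "ihm_rvb": ["utt1", "utt2"],
    "sdm": ["utt1", "utt2"],
    "gss": ["utt1", "utt2"],
}
pooled = pool(sources)
train = speed_perturb(pooled)
print(len(pooled), len(train))
```

The point of the sketch is only that a single model sees every utterance under several acoustic conditions, and that augmentation is applied after pooling, so it multiplies the whole pooled set rather than a single condition.
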
## Performance Record

### pruned_transducer_stateless7

The following results were obtained by decoding with `modified_beam_search`:

| Evaluation set     | dev WER (%) | test WER (%) |
|--------------------|-------------|--------------|
| IHM                | 18.92       | 17.40        |
| SDM                | 31.25       | 32.21        |
| MDM (GSS-enhanced) | 21.67       | 22.43        |

See the [recipe](https://github.com/k2-fsa/icefall/tree/master/egs) for details.
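
For readers unfamiliar with the metric in the table, the following is a minimal sketch of the standard word error rate definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, expressed as a percentage. The function name `word_error_rate` and the example sentences are illustrative only; the recipe's own scoring may apply text normalization that this sketch does not.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substitution ("starts" -> "start") and one insertion ("today")
# against a 5-word reference.
print(word_error_rate("the meeting starts at nine",
                      "the meeting start at nine today"))  # 40.0
```

Because insertions count as errors but do not add to the denominator, WER can exceed 100% on very noisy hypotheses.
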