---
datasets:
- AMI
language:
- en
license: apache-2.0
metrics:
- name: "IHM test WER"
  type: wer
  value: 17.40
- name: "SDM test WER"
  type: wer
  value: 32.21
- name: "GSS test WER"
  type: wer
  value: 22.43
tags:
- k2
- icefall
---
# AMI
This is an ASR recipe for the AMI corpus. AMI provides recordings from each speaker's
headset and lapel microphones, as well as from 2 microphone arrays with 8 channels each.
We pool the data in the following 4 ways and train a single model on the pooled data:

(i) individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) single distant microphone (SDM)
(iv) GSS-enhanced array microphones

Speed perturbation and MUSAN noise augmentation are additionally applied to the pooled
data.
## Performance Record
### pruned_transducer_stateless7
The following results were obtained by decoding with `modified_beam_search`:
| Evaluation set     | dev WER (%) | test WER (%) |
|--------------------|-------------|--------------|
| IHM                | 18.92       | 17.40        |
| SDM                | 31.25       | 32.21        |
| MDM (GSS-enhanced) | 21.67       | 22.43        |
See the [recipe](https://github.com/k2-fsa/icefall/tree/master/egs) for details.
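
For readers unfamiliar with the metric, the WER values reported above are word-level edit distances (substitutions, deletions, and insertions) divided by the number of reference words, expressed as a percentage. The following is a minimal sketch of that definition only; it is not the scoring code used by the recipe, and the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for a single reference/hypothesis pair.

    Illustrative helper (hypothetical name): words are obtained by
    whitespace splitting, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(round(word_error_rate("the cat sat on the mat", "the cat sat on mat"), 2))
```

Note that a corpus-level WER is normally computed by summing the errors and the reference word counts over all utterances before dividing, rather than by averaging per-utterance values.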