---
datasets:
- AMI
language:
- en
license: apache-2.0
metrics:
  -
    name: "IHM test WER"
    type: wer
    value: 17.40
  -
    name: "SDM test WER"
    type: wer
    value: 32.21
  -
    name: "GSS test WER"
    type: wer
    value: 22.43
tags:
- k2
- icefall
---
# AMI
This is an ASR recipe for the AMI corpus. AMI provides recordings from each speaker's
headset and lapel microphones, as well as from 2 microphone arrays with 8 channels each.
We pool data in the following 4 ways and train a single model on the pooled data:

(i) Individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) Single distant microphone (SDM)
(iv) Array microphones enhanced with guided source separation (GSS)

Speed perturbation and MUSAN noise augmentation are additionally applied to the pooled
data.
## Performance Record
### pruned_transducer_stateless7
The following WERs (%) were obtained by decoding with `modified_beam_search`:

| Evaluation set     | dev WER | test WER |
|--------------------|---------|----------|
| IHM                | 18.92   | 17.40    |
| SDM                | 31.25   | 32.21    |
| MDM (GSS-enhanced) | 21.67   | 22.43    |

See the [recipe](https://github.com/k2-fsa/icefall/tree/master/egs) for details.
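
For readers less familiar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the hypothesis and the reference, divided by the number of reference words and reported here as a percentage. The following is a minimal, self-contained sketch in plain Python. The function name `word_error_rate` and the example sentences are illustrative assumptions and are not part of icefall; the recipe's own scoring may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER (in percent) of `hypothesis` against `reference`.

    Illustrative helper only (not an icefall API). Words are obtained by
    whitespace splitting, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn ref[:i] into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(f"{word_error_rate('the cat sat on the mat', 'the cat sit on mat'):.2f}")
```

Corpus-level WER, as reported in tables like the one above, is normally computed by summing errors and reference words over all utterances before dividing, rather than by averaging per-utterance values.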