metadata
datasets:
  - AMI
language:
  - en
license: apache-2.0
metrics:
  - name: IHM test WER
    type: wer
    value: 17.4
  - name: SDM test WER
    type: wer
    value: 32.21
  - name: GSS test WER
    type: wer
    value: 22.43
tags:
  - k2
  - icefall

AMI

This is an ASR recipe for the AMI corpus. AMI provides recordings from each speaker's headset and lapel microphones, as well as from 2 microphone arrays with 8 channels each. We pool the following 4 types of data and train a single model on the pooled data:

(i) individual headset microphone (IHM)
(ii) IHM with simulated reverb
(iii) single distant microphone (SDM)
(iv) GSS-enhanced array microphones

Speed perturbation and MUSAN noise augmentation are additionally performed on the pooled data.
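
To illustrate what noise augmentation does to a training example, here is a minimal, dependency-free sketch that adds a noise signal to a speech signal at a chosen signal-to-noise ratio. It is not the recipe's implementation: the function name `mix_at_snr`, the list-of-floats representation and the 10 dB value are assumptions made for illustration only.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise so that the speech-to-noise power ratio is snr_db (dB).

    Hypothetical helper for illustration; signals are plain lists of float samples.
    """
    speech = list(speech)
    noise = list(noise)
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise if it is shorter than the speech, then trim to the same length.
    repeats = len(speech) // len(noise) + 1
    noise = (noise * repeats)[: len(speech)]

    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        return speech  # silent noise: nothing to add

    # Choose the gain so that speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Example: a synthetic tone mixed with random noise at 10 dB SNR (assumed value).
rng = random.Random(0)
tone = [math.sin(0.1 * i) for i in range(1000)]
hiss = [rng.uniform(-1.0, 1.0) for _ in range(300)]
noisy = mix_at_snr(tone, hiss, snr_db=10.0)
```

In practice the noise would come from a corpus such as MUSAN and the SNR would typically be sampled per utterance; those details are defined by the recipe, not by this sketch.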

Performance Record

pruned_transducer_stateless7

The following WERs (in %) were obtained with modified_beam_search decoding:

| Evaluation set | dev WER | test WER |
|---|---|---|
| IHM | 18.92 | 17.40 |
| SDM | 31.25 | 32.21 |
| MDM (GSS-enhanced) | 21.67 | 22.43 |

See the recipe for details.
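
For readers unfamiliar with the metric, the sketch below shows how a word error rate can be computed for a single reference/hypothesis pair: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words, expressed as a percentage. The function name `word_error_rate` is hypothetical and this is not the recipe's scoring code; any text normalization applied before scoring is not shown.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent between a reference and a hypothesis transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One deleted word out of six reference words.
score = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Corpus-level figures such as those reported above are normally obtained by summing the errors and the reference word counts over all utterances before dividing, rather than by averaging per-utterance scores.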