---
license: cc-by-nc-sa-4.0

language:
- lb
- de
- fr
- en
- pt

tags:
- STT
- ASR
- audio
- speech recognition
- coqui.ai

datasets:
- mbarnig/lb-2880-STT-CORPUS

---
#### The Luxembourgish part of my multilingual automatic speech recognition (ASR) model is the second machine learning (ML) speech-to-text (STT) model for Luxembourgish. The very first model was published in May 2022 by [Prof. Peter Gilles](https://infolux.uni.lu/automatic-speech-recognition-in-luxembourgish-a-very-first-model/) of the University of Luxembourg.

#### My model was trained from scratch with my customized dataset [mbarnig/lb-2880-STT-CORPUS](https://huggingface.co./datasets/mbarnig/lb-2880-STT-CORPUS) and the deep-learning toolkit 🐸 [Coqui-STT](https://github.com/coqui-ai/STT) (version 1.3.0). The model was trained without punctuation, using the following alphabet:   
```   
# Each line in this file represents the Unicode codepoint (UTF-8 encoded)
# associated with a numeric index.
# A line that starts with # is a comment. You can escape it with \# if you wish
# to use '#' in the Alphabet.
 
'abcdefghijklmnopqrstuvwxyz àáâäçèéëîôöûü

# The last (non-comment) line needs to end with a newline.
```   
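Because the model only knows the characters listed above, any text compared against its output (for example reference transcripts in an evaluation) has to be reduced to the same character set. The following is a minimal sketch of such a normalization step, not the preprocessing actually used for training, which is not documented here. The function name `normalize_transcript` and the choice to replace unknown characters with a space are assumptions made for illustration.

```python
import re
import unicodedata

# Characters copied from the alphabet listing above (space and apostrophe included).
ALPHABET = set(" 'abcdefghijklmnopqrstuvwxyzàáâäçèéëîôöûü")


def normalize_transcript(text: str) -> str:
    """Lowercase the text, drop characters outside the alphabet, collapse spaces."""
    # NFC makes sure accented letters are single code points, as in the alphabet.
    text = unicodedata.normalize("NFC", text).lower()
    # Assumption: map the typographic apostrophe to the plain one in the alphabet.
    text = text.replace("\u2019", "'")
    # Replace anything outside the alphabet with a space so words do not merge.
    kept = "".join(ch if ch in ALPHABET else " " for ch in text)
    return re.sub(r" +", " ", kept).strip()


print(normalize_transcript("Moien, wéi geet et?"))
```

Punctuation and digits are dropped by this sketch, so numbers would have to be spelled out beforehand if they matter for the comparison.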
#### A live inference demo of the ASR system is available in my HuggingFace space ⌨️ 🇱🇺 🔈 [mbarnig/lb-de-fr-en-pt-COQUI-STT](https://huggingface.co./spaces/mbarnig/lb-de-fr-en-pt-COQUI-STT).

#### Click the tab *training metrics* above to view the live TensorBoard of the model training runs with the small dataset (2880 samples) and with the expanded dataset (27072 samples), each with and without data augmentation.

![tensorboard](tensorboard/tensorboard-comparison.png)

#### The speech recognition models for the other languages have been released by Coqui.ai in the [model zoo](https://coqui.ai/models). I use the following versions in my ASR system:   
* [French STT v0.9](https://coqui.ai/french/commonvoice-fr/v0.9) Dataset: common-voice.fr
* [German STT v0.9](https://coqui.ai/german/AASHISHAG/v0.9.0) Datasets: Common Voice 5.1, SWC, MAILABS, Tuda-De, Voxforge
* [English STT huge vocab v1.0](https://coqui.ai/english/coqui/v1.0.0-huge-vocab) Datasets: Common Voice 7.0, Librispeech
* [Portuguese STT v0.1.1](https://coqui.ai/portuguese/itml/v0.1.1) Dataset: Common Voice 6.1
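
#### A multilingual setup like the one listed above can be sketched in Python as one model file per language code, loaded with the `stt` package from Coqui. This is only an illustrative sketch, not the code of my HuggingFace space: the paths in `MODEL_FILES` and the helper names `read_wav_pcm16` and `transcribe` are hypothetical, and the sketch assumes mono 16-bit PCM WAV input at the sample rate expected by the model.

```python
import wave

# Hypothetical local file names; the real model files come from this
# repository and from the Coqui model zoo entries listed above.
MODEL_FILES = {
    "lb": "models/lb/model.tflite",
    "de": "models/de/model.tflite",
    "fr": "models/fr/model.tflite",
    "en": "models/en/model.tflite",
    "pt": "models/pt/model.tflite",
}


def read_wav_pcm16(path):
    """Return (sample_rate, raw_bytes) for a mono 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(language, wav_path):
    """Transcribe one WAV file with the model selected by language code."""
    # Imported lazily so the helpers above work without these packages installed.
    import numpy as np
    from stt import Model

    model = Model(MODEL_FILES[language])
    sample_rate, pcm = read_wav_pcm16(wav_path)
    if sample_rate != model.sampleRate():
        raise ValueError(
            f"model expects {model.sampleRate()} Hz audio, got {sample_rate} Hz"
        )
    # The STT bindings take the audio as a NumPy array of 16-bit integers.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))
```

With the model files in place, a call such as `transcribe("lb", "sample.wav")` would return the recognized text as a string; `sample.wav` is a placeholder file name.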