jlondonobo committed
Commit: 469502d
1 Parent(s): 673eaaf
Add detail to README

README.md CHANGED
```diff
@@ -27,42 +27,37 @@ model-index:
       value: 6.5785713084850626
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
+# Whisper Medium Portuguese 🇧🇷🇵🇹
 
-# whisper-medium-pt
+Welcome to Whisper Medium for Portuguese transcription 👋🏻
 
-This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the common_voice_11_0 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3205
-- Wer: 6.5786
+If you are looking to **quickly** and **reliably** transcribe Portuguese audio to text, you are in the right place!
 
-## Model description
+With a state-of-the-art [Word Error Rate](https://huggingface.co/spaces/evaluate-metric/wer) (WER) of just **6.58** on Common Voice 11, this model more than **halves** the error rate of past state-of-the-art [wav2vec2](https://huggingface.co/Edresson/wav2vec2-large-xlsr-coraa-portuguese) models. Compared to the original [whisper-medium](https://huggingface.co/openai/whisper-medium), it is a **1.2×** improvement 🚀.
 
-More information needed
+This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the [mozilla-foundation/common_voice_11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0) dataset.
 
-## Intended uses & limitations
+The following table shows a **comparison** between our model's results and those of the most downloaded Portuguese Automatic Speech Recognition models on the Hub:
 
-More information needed
+| Model | WER | Parameters |
+|--------------------------------------------------|:--------:|:------------:|
+| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 8.10 | 769M |
+| [jlondonobo/whisper-medium-pt](https://huggingface.co/jlondonobo/whisper-medium-pt) | **6.58** 🤗 | 769M |
+| [jonatasgrosman/wav2vec2-large-xlsr-53-portuguese](https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-portuguese) | 11.31 | 317M |
+| [Edresson/wav2vec2-large-xlsr-coraa-portuguese](https://huggingface.co/Edresson/wav2vec2-large-xlsr-coraa-portuguese) | 20.08 | 317M |
 
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
 
 ### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 32
-- eval_batch_size: 16
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- training_steps: 5000
-- mixed_precision_training: Native AMP
+We used the following hyperparameters for training:
+- `learning_rate`: 1e-05
+- `train_batch_size`: 32
+- `eval_batch_size`: 16
+- `seed`: 42
+- `optimizer`: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- `lr_scheduler_type`: linear
+- `lr_scheduler_warmup_steps`: 500
+- `training_steps`: 5000
+- `mixed_precision_training`: Native AMP
 
 ### Training results
 
@@ -72,7 +67,7 @@ The following hyperparameters were used during training:
 | 0.0218 | 3.07 | 2000 | 0.2254 | 7.1098 |
 | 0.0053 | 5.06 | 3000 | 0.2711 | 6.9686 |
 | 0.0017 | 7.04 | 4000 | 0.3030 | 6.6862 |
-| 0.0005 | 9.02 | 5000 | 0.3205 | 6.5786 |
+| 0.0005 | 9.02 | 5000 | 0.3205 | **6.5786** 🤗 |
 
 
 ### Framework versions
@@ -80,4 +75,4 @@ The following hyperparameters were used during training:
 - Transformers 4.26.0.dev0
 - Pytorch 1.13.0+cu117
 - Datasets 2.7.1.dev0
-- Tokenizers 0.13.2
+- Tokenizers 0.13.2
```
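The updated card promises quick, reliable Portuguese transcription. As a minimal sketch of how one might use the checkpoint it describes (the snippet is not part of the commit; the audio file name is a placeholder):

```python
# Sketch: transcribe Portuguese audio with the fine-tuned checkpoint
# using the Transformers automatic-speech-recognition pipeline.
from transformers import pipeline

transcriber = pipeline(
    "automatic-speech-recognition",
    model="jlondonobo/whisper-medium-pt",  # checkpoint named in the card
    chunk_length_s=30,  # Whisper consumes audio in 30-second windows
)

# "sample_pt.mp3" is a placeholder path; the pipeline decodes and
# resamples the file internally (ffmpeg must be available).
print(transcriber("sample_pt.mp3")["text"])
```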
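The hyperparameters in the diff above map naturally onto `Seq2SeqTrainingArguments` from Transformers. A sketch of that mapping, assuming the usual Whisper fine-tuning setup (the `output_dir` is hypothetical; Adam's betas and epsilon are the library defaults, matching the values in the card):

```python
# Sketch: the card's training hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-medium-pt",  # hypothetical output directory
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",  # linear decay after warmup
    warmup_steps=500,
    max_steps=5000,
    fp16=True,  # "Native AMP" mixed-precision training
)
```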
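Both the comparison table and the training-results table report WER as a percentage. It can be recomputed with the `evaluate` library that the card links to; the transcripts below are toy stand-ins:

```python
# Sketch: computing Word Error Rate (WER) with the evaluate library.
import evaluate

wer = evaluate.load("wer")

predictions = ["olá mundo", "bom dia"]         # model transcripts (toy)
references = ["olá mundo", "bom dia a todos"]  # ground-truth text (toy)

# wer.compute returns a fraction of word-level errors; the card reports
# it scaled to a percentage (e.g. 6.58).
print(100 * wer.compute(predictions=predictions, references=references))
```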