metadata

license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
  - generated_from_trainer
datasets:
  - common_voice_15_0
metrics:
  - wer
model-index:
  - name: wav2vec2-xls-r-300m-br
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_15_0
          type: common_voice_15_0
          config: br
          split: None
          args: br
        metrics:
          - name: Wer
            type: wer
            value: 50.08524001794527

wav2vec2-xls-r-300m-br

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Breton (br) configuration of the common_voice_15_0 dataset. It achieves the following results on the evaluation set (Wer and Cer are percentages):

Loss: 0.8404
Wer: 50.0852
Cer: 17.4519
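
For readers unfamiliar with the two metrics, here is a minimal sketch of how word error rate and character error rate are conventionally defined: total edit distance divided by total reference length, reported as a percentage. This is not the evaluation code used for this model (the card does not show it), and the two Breton sentences are hypothetical transcripts chosen only for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def tokens(text, unit):
    """Split into words for WER, or into characters (spaces included) for CER."""
    return text.split() if unit == "word" else list(text)


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units."""
    edits = sum(edit_distance(tokens(r, unit), tokens(h, unit))
                for r, h in zip(references, hypotheses))
    total = sum(len(tokens(r, unit)) for r in references)
    return 100.0 * edits / total


# Hypothetical reference and predicted transcripts (not taken from the dataset).
refs = ["demat deoc'h", "mont a ra mat"]
hyps = ["demat deoch", "mont a ra mat"]

wer = error_rate(refs, hyps, unit="word")  # 1 wrong word out of 6
cer = error_rate(refs, hyps, unit="char")  # 1 missing character out of 25
print(f"WER: {wer:.2f}  CER: {cer:.2f}")
```

A single missing apostrophe costs a whole word in WER but only one character in CER, which is one reason the Cer reported above is much lower than the Wer.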

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 300
num_epochs: 40
mixed_precision_training: Native AMP
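
As a rough illustration of how some of these values relate, the sketch below recomputes total_train_batch_size from train_batch_size and gradient_accumulation_steps, and evaluates a linear schedule with warmup at a few steps. It is written in plain Python and only mirrors the usual shape of such a schedule; it is not the trainer's own code. TOTAL_STEPS is an assumption: the card does not state it, and 18360 is extrapolated from the results table (about 459 steps per epoch over 40 epochs). A single device is also assumed, since 8 x 2 already gives 16.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 4e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 300

# Assumption: not stated in the card; extrapolated as ~459 steps/epoch * 40 epochs.
TOTAL_STEPS = 18360


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Samples contributing to each optimizer update (num_devices=1 is assumed)."""
    return per_device * accumulation * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE,
                                                    GRADIENT_ACCUMULATION_STEPS))
for step in (0, 150, 300, 9330, TOTAL_STEPS):
    print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate reaches its peak of 4e-05 at step 300, is halfway down at step 9330, and reaches zero at the last step.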

Training results

Training Loss	Epoch	Step	Cer	Validation Loss	Wer
6.6871	1.09	500	100.0	3.2774	100.0
3.0612	2.18	1000	99.9339	2.7879	99.9910
1.7934	3.27	1500	29.4362	1.1762	80.5922
1.0914	4.36	2000	25.0591	0.9210	70.7941
0.8895	5.45	2500	23.6321	0.8364	67.1243
0.7831	6.54	3000	22.4169	0.7813	63.9480
0.697	7.63	3500	21.4625	0.7820	61.8214
0.6474	8.71	4000	20.7367	0.7471	59.4437
0.5969	9.8	4500	20.0072	0.7255	57.8914
0.5677	10.89	5000	20.0563	0.7440	57.5774
0.5286	11.98	5500	19.7483	0.7622	56.2494
0.5054	13.07	6000	19.1510	0.7318	55.1548
0.4831	14.16	6500	19.2096	0.7731	54.6882
0.4606	15.25	7000	19.0282	0.7457	54.4459
0.4432	16.34	7500	18.9923	0.7638	54.1319
0.4116	17.43	8000	18.6880	0.7576	53.3692
0.4099	18.52	8500	18.6653	0.7944	53.1359
0.3991	19.61	9000	18.7258	0.8229	52.9296
0.3796	20.7	9500	18.4555	0.8106	52.3194
0.3715	21.79	10000	18.1078	0.7611	51.8798
0.359	22.88	10500	18.4139	0.7921	52.2207
0.3384	23.97	11000	18.0624	0.8022	51.4850
0.3367	25.05	11500	18.0322	0.7921	51.5209
0.3295	26.14	12000	17.9811	0.8354	51.4491
0.3183	27.23	12500	17.8488	0.8171	51.0991
0.3135	28.32	13000	17.7354	0.8094	50.9915
0.309	29.41	13500	17.7978	0.8632	50.8659
0.2922	30.5	14000	17.6636	0.8268	50.7672
0.2987	31.59	14500	17.5918	0.8108	50.2557
0.2914	32.68	15000	17.4708	0.8237	50.0224
0.2893	33.77	15500	17.3877	0.8450	50.1211
0.2853	34.86	16000	17.5464	0.8354	50.4800
0.2791	35.95	16500	17.5257	0.8424	50.1929
0.2732	37.04	17000	17.5653	0.8390	50.2826
0.2691	38.13	17500	17.4671	0.8420	50.1122
0.2702	39.22	18000	17.4519	0.8404	50.0852
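
The metrics reported at the top of this card come from the last evaluation (step 18000), which is not necessarily the best one logged. As a small sketch, the snippet below takes the last eight evaluations from the table, with each row written as (step, cer, validation_loss, wer), and picks the rows with the lowest Wer and the lowest Cer. Whether those intermediate checkpoints were saved is not stated in the card.

```python
# (step, cer, validation_loss, wer) for the last eight evaluations in the table.
rows = [
    (14500, 17.5918, 0.8108, 50.2557),
    (15000, 17.4708, 0.8237, 50.0224),
    (15500, 17.3877, 0.8450, 50.1211),
    (16000, 17.5464, 0.8354, 50.4800),
    (16500, 17.5257, 0.8424, 50.1929),
    (17000, 17.5653, 0.8390, 50.2826),
    (17500, 17.4671, 0.8420, 50.1122),
    (18000, 17.4519, 0.8404, 50.0852),
]

best_wer = min(rows, key=lambda r: r[3])  # row with the lowest Wer
best_cer = min(rows, key=lambda r: r[1])  # row with the lowest Cer
final = rows[-1]                          # the evaluation reported in this card

print(f"lowest Wer: {best_wer[3]} at step {best_wer[0]}")
print(f"lowest Cer: {best_cer[1]} at step {best_cer[0]}")
print(f"final:      Wer {final[3]}, Cer {final[1]} at step {final[0]}")
```

The differences are small (about 0.06 points of Wer between step 15000 and step 18000), so the results have essentially plateaued over the last several epochs.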

Framework versions

Transformers 4.39.1
Pytorch 2.0.1+cu117
Datasets 2.18.0
Tokenizers 0.15.2