
speech-emotion-recognition-wav2vec2

This model is a fine-tuned version of jonatasgrosman/wav2vec2-large-xlsr-53-english on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2842
  • Accuracy: 0.9045
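
For reference, here is a minimal inference sketch using the `transformers` audio-classification pipeline. The checkpoint ID is the one this card belongs to; the audio path and `top_k` value are placeholders, and the emotion label set is whatever is stored in the model's config (it is not documented in this card).

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub.
classifier = pipeline(
    "audio-classification",
    model="abdulelahagr/speech-emotion-recognition-wav2vec2",
)

# "speech_sample.wav" is a placeholder path; 16 kHz mono audio is the usual
# input for wav2vec2-based models.
predictions = classifier("speech_sample.wav", top_k=5)
print(predictions)  # list of {"label": ..., "score": ...} dicts
```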

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1.0
  • mixed_precision_training: Native AMP
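
The list above maps roughly onto the `TrainingArguments` sketch below. `output_dir` and anything not listed above are assumptions rather than settings taken from the original run; the Adam betas and epsilon listed above are the `transformers` defaults.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="speech-emotion-recognition-wav2vec2",  # assumption
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # total train batch size: 4 * 2 = 8
    lr_scheduler_type="linear",
    num_train_epochs=1.0,
    fp16=True,                      # "Native AMP" mixed precision
)
```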

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.1026 | 0.0236 | 10 | 2.0265 | 0.1592 |
| 1.9631 | 0.0472 | 20 | 2.0125 | 0.1993 |
| 1.9106 | 0.0708 | 30 | 1.8609 | 0.2417 |
| 1.715 | 0.0943 | 40 | 1.7659 | 0.3054 |
| 1.69 | 0.1179 | 50 | 1.5524 | 0.3785 |
| 1.4684 | 0.1415 | 60 | 1.4516 | 0.4057 |
| 1.3422 | 0.1651 | 70 | 1.2702 | 0.5354 |
| 1.2358 | 0.1887 | 80 | 0.9599 | 0.6899 |
| 0.9937 | 0.2123 | 90 | 0.8447 | 0.7394 |
| 0.7604 | 0.2358 | 100 | 0.8068 | 0.7453 |
| 0.7736 | 0.2594 | 110 | 0.6561 | 0.7913 |
| 0.6573 | 0.2830 | 120 | 0.6584 | 0.7830 |
| 0.5634 | 0.3066 | 130 | 0.5564 | 0.8066 |
| 0.5353 | 0.3302 | 140 | 0.5586 | 0.8184 |
| 0.3805 | 0.3538 | 150 | 0.6575 | 0.7818 |
| 0.6584 | 0.3774 | 160 | 0.4686 | 0.8538 |
| 0.4788 | 0.4009 | 170 | 0.4533 | 0.8514 |
| 0.4123 | 0.4245 | 180 | 0.5266 | 0.8432 |
| 0.4964 | 0.4481 | 190 | 0.5038 | 0.8325 |
| 0.4489 | 0.4717 | 200 | 0.5552 | 0.8208 |
| 0.4562 | 0.4953 | 210 | 0.4075 | 0.8526 |
| 0.5362 | 0.5189 | 220 | 0.4975 | 0.8184 |
| 0.3539 | 0.5425 | 230 | 0.4947 | 0.8267 |
| 0.4726 | 0.5660 | 240 | 0.4456 | 0.8514 |
| 0.3897 | 0.5896 | 250 | 0.3567 | 0.8715 |
| 0.2817 | 0.6132 | 260 | 0.3880 | 0.8644 |
| 0.3281 | 0.6368 | 270 | 0.3902 | 0.8679 |
| 0.311 | 0.6604 | 280 | 0.3243 | 0.9021 |
| 0.1768 | 0.6840 | 290 | 0.4162 | 0.8644 |
| 0.3748 | 0.7075 | 300 | 0.4482 | 0.8644 |
| 0.588 | 0.7311 | 310 | 0.3179 | 0.8950 |
| 0.402 | 0.7547 | 320 | 0.2955 | 0.9033 |
| 0.4068 | 0.7783 | 330 | 0.3212 | 0.8962 |
| 0.3622 | 0.8019 | 340 | 0.3931 | 0.8550 |
| 0.4407 | 0.8255 | 350 | 0.3467 | 0.8644 |
| 0.3474 | 0.8491 | 360 | 0.3149 | 0.8962 |
| 0.3449 | 0.8726 | 370 | 0.2829 | 0.9033 |
| 0.2673 | 0.8962 | 380 | 0.2566 | 0.9198 |
| 0.2998 | 0.9198 | 390 | 0.2614 | 0.9127 |
| 0.2721 | 0.9434 | 400 | 0.2786 | 0.9021 |
| 0.2717 | 0.9670 | 410 | 0.2891 | 0.9021 |
| 0.3277 | 0.9906 | 420 | 0.2842 | 0.9045 |
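
The accuracy column above is the kind of value produced by a standard `compute_metrics` hook passed to the `Trainer`. The exact training script is not part of this card, so the following is only a sketch of how such a metric is typically computed with the `evaluate` library.

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # eval_pred bundles the model's logits and the integer emotion labels.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```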

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1