---
license: cc-by-sa-4.0
language:
- en
- zh
metrics:
- f1
library_name: transformers
pipeline_tag: audio-classification
tags:
- speech-emotion-recognition
---

# Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition

Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on English and Chinese data from elderly speakers.
The model is trained on the training sets of [CREMA-D](https://github.com/CheyneyComputerScience/CREMA-D), [CSED](https://github.com/AkishinoShiame/Chinese-Speech-Emotion-Datasets), [ElderReact](https://github.com/Mayer123/ElderReact), and [TESS](https://www.kaggle.com/datasets/ejlok1/toronto-emotional-speech-set-tess).
When using this model, make sure that your speech input is sampled at 16 kHz.
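
The 16 kHz requirement can be checked before inference. The snippet below is a minimal sketch, not the training or evaluation code from our repository: it reads the sample rate from a WAV header with Python's standard `wave` module (which only handles uncompressed PCM WAV files) and refuses to run the model on anything else. `MODEL_ID` is a hypothetical placeholder, and loading the checkpoint through the `audio-classification` pipeline is an assumption based on this card's `pipeline_tag`.

```python
import wave

EXPECTED_RATE = 16_000  # Hz, the input rate this model card asks for
MODEL_ID = "your-namespace/your-checkpoint"  # hypothetical placeholder, replace with the real repo id


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16khz(path):
    """True if the WAV file at `path` is sampled at 16 kHz."""
    return wav_sample_rate(path) == EXPECTED_RATE


def classify_emotion(path, model_id=MODEL_ID):
    """Classify the emotion in a 16 kHz WAV file.

    Assumption: the checkpoint loads with the generic transformers
    `audio-classification` pipeline. Raises ValueError for other rates.
    """
    rate = wav_sample_rate(path)
    if rate != EXPECTED_RATE:
        raise ValueError(
            f"{path} is sampled at {rate} Hz; resample it to {EXPECTED_RATE} Hz first."
        )
    # Imported lazily so the sample-rate helpers work without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return classifier(path)
```

If a recording uses a different rate, resample it with an audio library of your choice (for example `librosa` or `torchaudio`) before calling `classify_emotion`.
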

The scripts used for training and evaluation can be found here:
[https://github.com/HLTCHKUST/elderly_ser/tree/main](https://github.com/HLTCHKUST/elderly_ser/tree/main)

## Evaluation Results

For details (e.g., the statistics of the `train`, `valid`, and `test` data), please refer to our paper on [arXiv](https://arxiv.org/abs/2306.14517).
The paper also reports the model's speech emotion recognition performance on six evaluation settings: English-All, Chinese-All, English-Elderly, Chinese-Elderly, English-Adults, and Chinese-Adults.
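
To illustrate how a per-setting F1 score can be computed, here is a minimal pure-Python sketch. It assumes macro-averaged F1 (this card only names `f1`, so the averaging mode is an assumption), and the record format, the function names, and the toy labels in the usage note are hypothetical. It is not the evaluation script from our repository.

```python
from collections import defaultdict


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for every label seen in the references or predictions."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (0.0 for empty input)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


def macro_f1_by_group(records):
    """Compute macro F1 per evaluation setting.

    `records` is an iterable of (group, true_label, predicted_label) tuples,
    where `group` is a setting name such as "English-Elderly".
    """
    grouped = defaultdict(lambda: ([], []))
    for group, true_label, pred_label in records:
        grouped[group][0].append(true_label)
        grouped[group][1].append(pred_label)
    return {group: macro_f1(ts, ps) for group, (ts, ps) in grouped.items()}
```

For example, `macro_f1_by_group([("English-Elderly", "happy", "happy"), ("Chinese-Elderly", "sad", "happy")])` returns one score per setting name found in the records.
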

## Citation

Our paper will be published at INTERSPEECH 2023. In the meantime, it is available on [arXiv](https://arxiv.org/abs/2306.14517).
If you find our work useful, please consider citing our paper as follows:
```
@misc{cahyawijaya2023crosslingual,
  title={Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition},
  author={Samuel Cahyawijaya and Holy Lovenia and Willy Chung and Rita Frieske and Zihan Liu and Pascale Fung},
  year={2023},
  eprint={2306.14517},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```