SSL4PR WavLM Base and HuBERT Base Models

This repository hosts the pre-trained SSL4PR models for Parkinson's Disease detection from speech in real-world operating conditions. These models are based on the paper "Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions" by La Quatra et al., published at Interspeech 2024. Paper Link

Repository Link

GitHub Repository: please refer to the repository for all details on the models, training, and usage.

Pre-trained Models

Pre-trained models are available on the Hugging Face model hub. To use the SSL4PR models, please clone the desired repository by running one of the following commands:

# For fold-based models (10-fold cross-validation)
git clone https://huggingface.co/morenolq/SSL4PR-wavlm-base
git clone https://huggingface.co/morenolq/SSL4PR-hubert-base

# For full training models (trained on complete s-PC-GITA)
git clone https://huggingface.co/morenolq/SSL4PR-wavlm-base-full
git clone https://huggingface.co/morenolq/SSL4PR-hubert-base-full

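Ensure you have Git LFS installed before cloning; without it, the checkpoint files are downloaded as small text pointers instead of the actual weights.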
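If Git LFS was missing at clone time, the `.pt` files in the cloned directory are short text pointers rather than model weights. The Python sketch below is one way to detect that. The function names are illustrative and not part of the SSL4PR code base, and the sketch assumes the checkpoints sit at the top level of the cloned directory.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` is an LFS pointer rather than real data."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def unresolved_checkpoints(repo_dir):
    """List the .pt files directly under repo_dir that are still LFS pointers.

    Assumption: checkpoints live at the top level of the clone.
    """
    return sorted(p.name for p in Path(repo_dir).glob("*.pt") if is_lfs_pointer(p))
```

If `unresolved_checkpoints("SSL4PR-wavlm-base")` returns a non-empty list, running `git lfs pull` inside the clone should fetch the real files.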

Fold-based Models

The fold-based repositories contain models trained using 10-fold cross-validation on s-PC-GITA. Each repository contains 10 pre-trained models, one per fold, named fold_1.pt, fold_2.pt, ..., fold_10.pt.
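As a quick sanity check on a clone, the naming scheme above can be turned into a list of expected paths. This is a minimal sketch: the helper names are hypothetical, and it assumes the ten files sit at the top level of the cloned directory.

```python
from pathlib import Path

NUM_FOLDS = 10  # one checkpoint per fold: fold_1.pt ... fold_10.pt


def fold_checkpoint_paths(repo_dir):
    """Return the expected checkpoint path for each fold, in fold order."""
    root = Path(repo_dir)
    return [root / f"fold_{i}.pt" for i in range(1, NUM_FOLDS + 1)]


def missing_folds(repo_dir):
    """Return the fold numbers whose checkpoint file is absent from repo_dir."""
    return [
        i
        for i, path in enumerate(fold_checkpoint_paths(repo_dir), start=1)
        if not path.is_file()
    ]
```

For a complete clone, `missing_folds("SSL4PR-wavlm-base")` should return an empty list.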

Full Training Models

The full training repositories contain models trained on the complete s-PC-GITA dataset and tested on enhanced e-PC-GITA (as reported in Table 3 of the paper). Each repository contains a single model file named model.pt.

All models are available in PyTorch format. ⚠️ Please note that the models are not directly compatible with the Hugging Face Transformers library because they use custom head components (e.g., attention pooling and layer weighting, among others), as defined in the model class.
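Because the checkpoints cannot be loaded through Transformers, a reasonable first step is to open one with plain PyTorch and look at what it contains. The sketch below makes no claim about the file layout: whether a `.pt` file holds a state dict or something else is not stated here, and both helper names are hypothetical. Rebuilding a usable model still requires the model class from the GitHub repository.

```python
def load_checkpoint(path):
    """Load a .pt file onto the CPU and return whatever object was saved in it."""
    import torch  # imported lazily so describe_checkpoint works without PyTorch

    # map_location="cpu" avoids needing a GPU just to inspect the file.
    return torch.load(path, map_location="cpu")


def describe_checkpoint(obj):
    """Summarize a loaded checkpoint.

    Returns the sorted key names if the object is a dict (for example a
    state dict), otherwise the name of the object's type.
    """
    if isinstance(obj, dict):
        return sorted(str(key) for key in obj)
    return type(obj).__name__
```

For example, `describe_checkpoint(load_checkpoint("SSL4PR-wavlm-base-full/model.pt"))` would list the parameter names if the file turns out to be a state dict. Recent PyTorch releases restrict unpickling by default, so a file that stores more than plain tensors may need extra care to load.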

The model architecture is shown below:

[Figure: Model Architecture]

Citation

@inproceedings{laquatra24_interspeech,
  title     = {Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions},
  author    = {Moreno {La Quatra} and Maria Francesca Turco and Torbjørn Svendsen and Giampiero Salvi and Juan Rafael Orozco-Arroyave and Sabato Marco Siniscalchi},
  year      = {2024},
  booktitle = {Interspeech 2024},
  pages     = {1405--1409},
  doi       = {10.21437/Interspeech.2024-522},
  issn      = {2958-1796},
}