End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge

Authors: Sheikh, Shakeel Ahmad; Sahidullah, Md; Hirsch, Fabrice; Ouni, Slim
Publication year: 2022
Subject:
Document type: Working Paper
Description: In this paper, we present end-to-end and speech-embedding-based systems trained in a self-supervised fashion to participate in the ACM Multimedia 2022 ComParE Challenge, specifically the stuttering sub-challenge. In particular, we exploit the embeddings from the pre-trained Wav2Vec2.0 model for stuttering detection (SD) on the KSoF dataset. After embedding extraction, we benchmark several methods for SD. Our proposed self-supervised SD system achieves an unweighted average recall (UAR) of 36.9% and 41.0% on the validation and test sets respectively, which is 31.32% (validation set) and 1.49% (test set) higher in relative terms than the best (DeepSpectrum) challenge baseline (CBL). Moreover, we show that concatenating layer embeddings with Mel-frequency cepstral coefficient (MFCC) features further improves the UAR, by 33.81% and 5.45% on the validation and test sets respectively over the CBL. Finally, we demonstrate that summing information across all the layers of Wav2Vec2.0 surpasses the CBL by a relative margin of 45.91% and 5.69% on the validation and test sets respectively. Grand-challenge: Computational Paralinguistics ChallengE
Comment: Accepted at the ACM MM 2022 Conference: Grand Challenges. "© {Owner/Author | ACM} {2022}. This is the author's version of the work. It is posted here for your personal use. Not for redistribution."
Database: arXiv