Additive Cross-Modal Attention Network (ACMA) for Depression Detection Based on Audio and Textual Features

Authors: Ngumimi Karen Iyortsuun, Soo-Hyung Kim, Hyung-Jeong Yang, Seung-Won Kim, Min Jhon
Language: English
Year of publication: 2024
Subject:
Source: IEEE Access, Vol. 12, pp. 20479-20489 (2024)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3362233
Description: Depression is commonly detected using standardized questionnaires such as the Patient Health Questionnaire (PHQ-8/9). However, patients do not always respond truthfully, which can lead to misdiagnosis, so a means of detecting depression without preset questions is highly important. Addressing this challenge, our study aims to discern telltale symptoms from patients' own statements. We harness both audio and text data, proposing an Additive Cross-Modal Attention network that learns the weights that best capture the cross-modal interactions and relationships between the two feature sets, with BiLSTM as the backbone for both modalities. We tested our approach on the DAIC-WOZ dataset for depression detection and also evaluated model performance on the EATD-Corpus. Benchmarked against similar studies on these datasets, our method performs well on both classification and regression tasks, in both unimodal and multimodal settings. Our findings underscore the potential of our model to effectively detect depression from textual and speech modalities without relying on preset questions.
Database: Directory of Open Access Journals
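The abstract describes additive cross-modal attention between BiLSTM-encoded audio and text streams. Below is a minimal PyTorch sketch of that idea, using Bahdanau-style additive scoring in both directions; the feature dimensions, mean-pooling fusion, and single-logit head are illustrative assumptions, not the authors' exact ACMA configuration:

```python
import torch
import torch.nn as nn


class AdditiveCrossModalAttention(nn.Module):
    """Additive (Bahdanau-style) attention where one modality's states
    query the other's. A sketch, not the paper's exact implementation."""

    def __init__(self, query_dim: int, key_dim: int, attn_dim: int):
        super().__init__()
        self.w_q = nn.Linear(query_dim, attn_dim, bias=False)
        self.w_k = nn.Linear(key_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, query: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
        # query: (B, Tq, Dq), keys: (B, Tk, Dk)
        # Broadcast-add to score every (query step, key step) pair.
        scores = self.v(torch.tanh(
            self.w_q(query).unsqueeze(2) + self.w_k(keys).unsqueeze(1)
        )).squeeze(-1)                          # (B, Tq, Tk)
        weights = torch.softmax(scores, dim=-1)  # attention over key steps
        return torch.bmm(weights, keys)          # (B, Tq, Dk) attended context


class ACMASketch(nn.Module):
    """BiLSTM encoder per modality plus cross-modal additive attention in
    both directions, fused for a binary depression logit (hypothetical
    dimensions; a regression head would differ only in the loss)."""

    def __init__(self, audio_dim=40, text_dim=300, hidden=128, attn_dim=64):
        super().__init__()
        self.audio_lstm = nn.LSTM(audio_dim, hidden,
                                  batch_first=True, bidirectional=True)
        self.text_lstm = nn.LSTM(text_dim, hidden,
                                 batch_first=True, bidirectional=True)
        d = 2 * hidden  # BiLSTM doubles the hidden size
        self.text_to_audio = AdditiveCrossModalAttention(d, d, attn_dim)
        self.audio_to_text = AdditiveCrossModalAttention(d, d, attn_dim)
        self.head = nn.Linear(4 * d, 1)

    def forward(self, audio: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        a, _ = self.audio_lstm(audio)      # (B, Ta, 2H)
        t, _ = self.text_lstm(text)        # (B, Tt, 2H)
        a_ctx = self.text_to_audio(t, a)   # text queries attend over audio
        t_ctx = self.audio_to_text(a, t)   # audio queries attend over text
        # Mean-pool over time and concatenate unimodal + cross-modal views.
        fused = torch.cat([a.mean(1), t.mean(1),
                           a_ctx.mean(1), t_ctx.mean(1)], dim=-1)
        return self.head(fused)            # (B, 1) depression logit


if __name__ == "__main__":
    model = ACMASketch()
    audio = torch.randn(2, 200, 40)   # e.g. 200 frames of 40-dim acoustic features
    text = torch.randn(2, 50, 300)    # e.g. 50 tokens of 300-dim word embeddings
    print(model(audio, text).shape)   # torch.Size([2, 1])
```

The additive form, score(q, k) = v^T tanh(W_q q + W_k k), lets the model learn a weighting over cross-modal step pairs even when the two modalities have different sequence lengths and feature spaces, which is the property the abstract attributes to ACMA.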