Semantic and Acoustic Markers in Schizophrenia-Spectrum Disorders: A Combinatory Machine Learning Approach

Authors:	Alban E Voppel, Janna N de Boer, Sanne G Brederoo, Hugo G Schnack, Iris E C Sommer
Contributors:	Clinical Cognitive Neuropsychiatry Research Program (CCNP), Movement Disorder (MD)
Year of publication:	2022
Subject:	Psychiatry and Mental health; language; classification; speech; ensemble learning; psychosis
Source:	Schizophrenia Bulletin, 49(2), S163-S171. Oxford University Press
ISSN:	1745-1701 0586-7614
Description:	Background and hypothesis: Speech is a promising marker to aid diagnosis of schizophrenia-spectrum disorders, as it reflects symptoms such as thought disorder and negative symptoms. Previous approaches used different domains of speech for diagnostic classification, including features such as coherence (semantic) and form (acoustic). However, an examination of the added value of each domain when the two are combined is still lacking. Here, we investigate the acoustic and semantic domains separately and combined. Study design: Using semi-structured interviews, speech of 94 subjects with schizophrenia-spectrum disorders (SSD) and 73 healthy controls (HC) was recorded. Acoustic features were extracted using a standardized feature set, and transcribed interviews were used to calculate semantic word similarity using word2vec. Random forest classifiers were trained for each domain. A third classifier was used to combine features from both domains; 10-fold cross-validation was used for each model. Results: The acoustic random forest classifier achieved 81% accuracy in classifying SSD and HC, while the semantic domain classifier reached an accuracy of 80%. Joining features from the two domains, the combined classifier reached 85% accuracy, significantly improving on the separate domain classifiers. For the combined classifier, the top features were fragmented speech from the acoustic domain and variance of similarity from the semantic domain. Conclusions: Both semantic and acoustic analyses of speech achieved ~80% accuracy in classifying SSD from HC. We replicate earlier findings per domain, additionally showing that combining these features significantly improves classification performance. Feature importance and accuracy in combined classification indicate that the domains measure different, complementary aspects of speech.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4ce5eee14dfe7835c01b468e2a81eed1 https://doi.org/10.1093/schbul/sbac142
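
Illustration: The description above names variance of semantic word similarity as a top feature of the combined classifier. The record does not say how similarity was windowed or aggregated, so the following is only a minimal sketch of one plausible reading: cosine similarity between consecutive word vectors, summarized by mean and variance. The function names, the consecutive-word pairing, and the two-dimensional toy vectors are all assumptions made for illustration; a real analysis would load trained word2vec embeddings instead.

```python
import math
from statistics import mean, pvariance


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_features(tokens, embeddings):
    """Mean and population variance of similarity between consecutive known words.

    Assumption: similarity is taken between adjacent words only; the study
    may have used a different window or aggregation.
    """
    # Skip words that have no embedding (out-of-vocabulary tokens).
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    sims = [cosine(a, b) for a, b in zip(vectors, vectors[1:])]
    if not sims:
        return {"mean_similarity": 0.0, "variance_similarity": 0.0}
    return {
        "mean_similarity": mean(sims),
        "variance_similarity": pvariance(sims),
    }


# Hypothetical toy embeddings standing in for a trained word2vec model.
TOY_EMBEDDINGS = {
    "dog": [1.0, 0.0],
    "puppy": [1.0, 0.0],
    "car": [0.0, 1.0],
}

steady = similarity_features(["dog", "puppy", "dog"], TOY_EMBEDDINGS)
shifting = similarity_features(["dog", "puppy", "car"], TOY_EMBEDDINGS)
print(steady)
print(shifting)
```

In this toy setup the word sequence that stays on one topic has zero variance of similarity, while the sequence that jumps to an unrelated word has a higher variance. Per-speaker values of this kind could then be supplied, alongside acoustic features, to a classifier such as the random forests mentioned in the description.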