Validation of an AI-assisted Treatment Outcome Measure for Gender-Affirming Voice Care: Comparing AI Accuracy to Listener's Perception of Voice Femininity.
Authors: | Simon S; Department of Otolaryngology-Head & Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California., Silverstein E; Department of Otolaryngology-Head & Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California., Timmons-Sund L; Caruso Department of Otolaryngology, Head and Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California., Pinto JM; MILA Institute for Artificial Intelligence, Montreal, Canada., Castro EM; Caruso Department of Otolaryngology, Head and Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California., O'Dell K; Caruso Department of Otolaryngology, Head and Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California., Johns III MM; Caruso Department of Otolaryngology, Head and Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California., Mack WJ; Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, California., Bensoussan Y; University of South Florida, Department of Otolaryngology, Head & Neck Surgery, Tampa, Florida. Electronic address: yaelbensoussan@usf.edu. |
---|---|
Language: | English |
Source: | Journal of voice : official journal of the Voice Foundation [J Voice] 2023 Dec 29. Date of Electronic Publication: 2023 Dec 29. |
DOI: | 10.1016/j.jvoice.2023.12.008 |
Abstract: | Objectives: There is currently a lack of objective treatment outcome measures for transgender individuals undergoing gender-affirming voice care. Recently, Bensoussan et al. developed an AI model that generates a voice femininity rating from a short voice sample provided through a smartphone application. The purpose of this study was to examine the feasibility of using this model as a treatment outcome measure by comparing its performance to that of human listeners. Additionally, we examined the effect of two different training datasets on the model's accuracy and performance when presented with external data. Methods: A total of 100 voice recordings from 50 cisgender males and 50 cisgender females were retrospectively collected from patients presenting at a university voice clinic for reasons other than dysphonia. The recordings were evaluated by expert and naïve human listeners, who rated each voice based on how sure they were that the voice belonged to a female speaker (% voice femininity [R]). Human ratings were compared to ratings generated by (1) the AI model trained on a high-quality, low-quantity dataset (voices from the Perceptual Voice Quality Database) (PVQD model), and (2) the AI model trained on a low-quality, high-quantity dataset (voices from the Mozilla Common Voice database) (Mozilla model). Ambiguity scores were calculated as the absolute value of the difference between the rating and certainty (0 or 100%). Results: Both expert and naïve listeners achieved 100% accuracy in identifying voice gender based on a binary classification (female >50% voice femininity [R]). In comparison, the Mozilla-trained model achieved 92% accuracy and the previously published PVQD model achieved 84% accuracy in determining voice gender (female >50% AI voice femininity). While both AI models correlated with human ratings, the Mozilla-trained model showed a stronger correlation as well as lower overall rating ambiguity than the PVQD-trained model. The Mozilla model also appeared to handle pitch information in a similar way to human raters. Conclusions: The AI model predicted voice gender with high accuracy when compared to human listeners and has potential as a useful outcome measure for transgender individuals receiving gender-affirming voice training. The Mozilla-trained model performed better than the PVQD-trained model, indicating that for binary classification tasks, the quantity of data may influence accuracy more than the quality of the data used for training the voice AI models. Competing Interests: Declaration of Competing Interest The authors declare the following financial interests/personal relationships which may be considered as potential competing interests. (Copyright © 2023 The Voice Foundation. Published by Elsevier Inc. All rights reserved.) |
Database: | MEDLINE |
External link: |
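
The abstract's two scoring rules (binary classification at >50% voice femininity, and an ambiguity score defined as the absolute difference between a rating and certainty at 0 or 100%) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names and the sample ratings are hypothetical, and the abstract does not state which pole "certainty" refers to, so the sketch assumes the nearer of 0 and 100.

```python
from typing import Sequence


def classify_female(rating: float, threshold: float = 50.0) -> bool:
    """Binary rule from the abstract: female if the rating exceeds 50%."""
    return rating > threshold


def ambiguity(rating: float) -> float:
    """Absolute distance from the rating to certainty (0 or 100%).

    Assumption: "certainty" is taken as the nearer pole, so the score
    ranges from 0 (fully certain) to 50 (maximally ambiguous).
    """
    if not 0.0 <= rating <= 100.0:
        raise ValueError("rating must be a percentage between 0 and 100")
    return min(abs(rating - 0.0), abs(rating - 100.0))


def accuracy(ratings: Sequence[float], is_female: Sequence[bool]) -> float:
    """Fraction of recordings whose thresholded rating matches the label."""
    if len(ratings) != len(is_female) or len(ratings) == 0:
        raise ValueError("ratings and labels must be non-empty and equal in length")
    correct = sum(classify_female(r) == f for r, f in zip(ratings, is_female))
    return correct / len(ratings)


# Hypothetical ratings for four recordings (illustrative only, not study data).
example_ratings = [92.0, 35.0, 8.0, 61.0]
example_labels = [True, True, False, False]

example_accuracy = accuracy(example_ratings, example_labels)
example_ambiguity = [ambiguity(r) for r in example_ratings]
```

If the study instead measured distance to the pole matching the speaker's gender, `ambiguity` would take the label as a second argument and return `abs(rating - 100.0)` for female speakers and `abs(rating - 0.0)` for male speakers.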