Performance evaluation of human cough annotators: optimal metrics and sex differences.

Authors: Sanchez-Olivieri I; Universidad de Navarra, Pamplona, Spain., Rudd M; Hyfe Inc, Wilmington, Delaware, USA., Gabaldon-Figueira JC; ISGlobal, Barcelona Institute for Global Health, Barcelona, Spain., Carmona-Torre F; Universidad de Navarra, Pamplona, Spain., Del Pozo JL; Universidad de Navarra, Pamplona, Spain., Moorsmith R; Hyfe Inc, Wilmington, Delaware, USA., Jover L; Hyfe Inc, Wilmington, Delaware, USA., Galvosas M; Hyfe Inc, Wilmington, Delaware, USA., Small P; Hyfe Inc, Wilmington, Delaware, USA., Grandjean Lapierre S; Dept of Microbiology, Infectious Diseases and Immunology, Research Center of the University of Montreal Hospital Center, Montreal, Quebec, Canada.; Immunopathology Axis, Research Center of the University of Montreal Hospital Center, Montreal, Quebec, Canada., Chaccour C; Universidad de Navarra, Pamplona, Spain carlos.chaccour@isglobal.org.; ISGlobal, Barcelona Institute for Global Health, Barcelona, Spain.; Centro de investigación biomédica en red enfermedades infecciosas, Madrid, Spain.
Language: English
Source: BMJ open respiratory research [BMJ Open Respir Res] 2023 Nov; Vol. 10 (1).
DOI: 10.1136/bmjresp-2023-001942
Abstract: Introduction: Despite its high prevalence and significance, there is still no widely available method to quantify cough. To demonstrate agreement with the current gold standard of human annotation, emerging automated techniques require a robust, reproducible approach to annotation. We describe the extent to which a human annotator of cough sounds (a) agrees with herself (intralabeller or intrarater agreement) and (b) agrees with other independent labellers (interlabeller or inter-rater agreement). We also describe significant sex differences in cough sound length and epoch size.
Materials and Methods: 24 participants wore an audiorecording smartwatch to capture 6-24 hours of continuous audio. A randomly selected sample of the whole audio was labelled twice by an expert annotator and a third time by six trained annotators. We collected 400 hours of audio and analysed 40 hours. The cough counts as well as cough seconds (any 1 s of time containing at least one cough) from different annotators were compared and summary statistics from linear and Bland-Altman analyses were used to quantify intraobserver and interobserver agreement.
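The two units of analysis and the agreement statistics named above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names are hypothetical, the sample numbers are invented for illustration only, and it assumes each cough is assigned to the 1 s bin containing its onset (the record does not say how a cough spanning two seconds is handled).

```python
import math
from statistics import mean, stdev


def cough_seconds(onsets_s):
    """Return the set of 1 s bins containing at least one cough onset.

    Assumption: a cough belongs to the second in which it starts.
    """
    return {math.floor(t) for t in onsets_s}


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits of agreement are bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson(a, b):
    """Return Pearson's correlation coefficient for paired measurements."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Invented example: two annotators label the same short clip.
# Annotator A marks two coughs in the first second, annotator B marks one.
onsets_a = [0.2, 0.7, 3.1]
onsets_b = [0.3, 3.2]
count_diff = abs(len(onsets_a) - len(onsets_b))
second_diff = len(cough_seconds(onsets_a) ^ cough_seconds(onsets_b))
print(count_diff, second_diff)
```

In this invented clip the annotators disagree on the number of coughs but mark the same cough seconds, which is the kind of discrepancy the coarser unit absorbs. Per-recording totals from two annotators would then be passed as paired lists to `bland_altman` and `pearson`.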
Results: There was excellent intralabeller agreement (less than two disagreements per hour monitored, Pearson's correlation 0.98) and interlabeller agreement (Pearson's correlation 0.96). Using cough seconds as the unit of analysis decreased annotator discrepancies by 50% in comparison to coughs. Within this data set, the length of cough sounds and epoch size (number of coughs per bout or attack) differed between women and men.
Conclusion: Given the decreased interobserver variability in annotation when using cough seconds (vs just coughs), we propose their use for manually annotating cough when assessing the performance of automatic cough monitoring systems. The differences in cough sound length and epoch size may have important implications for equality in the development of cough monitoring tools.
Trial Registration Number: NCT05042063.
Competing Interests: MR, JCG, RM, LJ, MG and PS were or are employees of Hyfe and own equity in Hyfe. CCh has received consultancy fees and owns equity in Hyfe. All other authors declare no conflict of interest.
(© Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.)
Database: MEDLINE