An Anechoic, High-Fidelity, Multidirectional Speech Corpus.

Authors: Miller MK (Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE); Delaram V (Department of Speech & Hearing Science, University of Illinois Urbana-Champaign); Trine A (Department of Speech & Hearing Science, University of Illinois Urbana-Champaign); Ananthanarayana RM (Department of Speech & Hearing Science, University of Illinois Urbana-Champaign); Buss E (Department of Otolaryngology/Head & Neck Surgery, University of North Carolina at Chapel Hill); Monson BB (Department of Speech & Hearing Science, University of Illinois Urbana-Champaign, and Department of Biomedical and Translational Sciences, Carle Illinois College of Medicine, University of Illinois Urbana-Champaign); Stecker GC (Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE)
Language: English
Source: Journal of speech, language, and hearing research : JSLHR [J Speech Lang Hear Res] 2025 Jan 02; Vol. 68 (1), pp. 411-418. Date of Electronic Publication: 2024 Dec 02.
DOI: 10.1044/2024_JSLHR-24-00296
Abstract: Introduction: We currently lack speech testing materials faithful to broader aspects of real-world auditory scenes, such as speech directivity and extended high-frequency (EHF; > 8 kHz) content, that have demonstrable effects on speech perception. Here, we describe the development of a multidirectional, high-fidelity speech corpus using multichannel anechoic recordings that can be used for future studies of speech perception in complex environments by diverse listeners.
Design: Fifteen male and 15 female talkers (21.3-60.5 years) recorded Bamford-Kowal-Bench (BKB) Standard Sentence Test lists, digits 0-10, and a 2.5-min unscripted narrative. Recordings were made in an anechoic chamber with 17 free-field condenser microphones spanning 0°-180° azimuth angle around the talker using a 48 kHz sampling rate.
Results: Recordings resulted in a large corpus containing four BKB lists, 10 digits, and narratives produced by 30 talkers, and an additional 17 BKB lists (21 total) produced by a subset of six talkers.
Conclusions: The goal of this study was to create an anechoic, high-fidelity, multidirectional speech corpus using standard speech materials. More naturalistic narratives, useful for the creation of babble noise and speech maskers, were also recorded. A large group of 30 talkers permits testers to select speech materials based on talker characteristics relevant to a specific task. The resulting speech corpus allows for more diverse and precise speech recognition testing, including testing effects of speech directivity and EHF content. Recordings are publicly available.
Database: MEDLINE
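
Note on the recording parameters: the Design section gives a 48 kHz sampling rate and 17 microphones spanning 0°-180° azimuth. The sketch below works out two quantities that follow from those numbers. The Nyquist limit (half the sampling rate) is standard signal theory. The uniform angular spacing is only an assumption made for illustration, since the abstract does not state how the microphones were distributed along the arc. All names in the code are hypothetical.

```python
# Minimal sketch of the arithmetic implied by the recording parameters.
# Values marked "from the abstract" are taken from the record above;
# the uniform microphone spacing is an ASSUMPTION, not stated in the source.

SAMPLE_RATE_HZ = 48_000      # from the abstract
EHF_LOWER_EDGE_HZ = 8_000    # from the abstract (EHF defined as > 8 kHz)
NUM_MICROPHONES = 17         # from the abstract
ARC_START_DEG = 0.0          # from the abstract
ARC_END_DEG = 180.0          # from the abstract


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


def uniform_azimuths(count: int, start_deg: float, end_deg: float) -> list[float]:
    """Azimuths for `count` microphones, ASSUMING equal spacing with both endpoints included."""
    step = (end_deg - start_deg) / (count - 1)
    return [start_deg + i * step for i in range(count)]


nyquist = nyquist_hz(SAMPLE_RATE_HZ)
ehf_bandwidth_hz = nyquist - EHF_LOWER_EDGE_HZ
azimuths = uniform_azimuths(NUM_MICROPHONES, ARC_START_DEG, ARC_END_DEG)

print(f"Nyquist limit: {nyquist / 1000:.1f} kHz")
print(f"EHF band representable: {EHF_LOWER_EDGE_HZ / 1000:.0f}-{nyquist / 1000:.0f} kHz")
print(f"Assumed uniform spacing: {azimuths[1] - azimuths[0]:.2f} degrees")
```

Under these assumptions, a 48 kHz rate can represent content up to 24 kHz, so the EHF region above 8 kHz is captured up to that limit, and equal spacing would place a microphone every 11.25°. The actual microphone layout should be taken from the full article rather than from this sketch.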