Automatic detection of prosodic boundaries in spontaneous speech.

Authors: Biron T; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel., Baum D; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel., Freche D; Sagol Center for Brain and Mind, Interdisciplinary Center, Herzliya, Israel., Matalon N; Department of Linguistics, The Hebrew University, Jerusalem, Israel., Ehrmann N; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel., Weinreb E; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel., Biron D; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel., Moses E; Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, Israel.
Language: English
Source: PloS one [PLoS One] 2021 May 03; Vol. 16 (5), pp. e0250969. Date of Electronic Publication: 2021 May 03 (Print Publication: 2021).
DOI: 10.1371/journal.pone.0250969
Abstract: Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between prosodic phrases. This is done naturally by the human ear, yet has proved surprisingly difficult to achieve reliably and simply in an automatic manner. Efforts to date have focused on detecting phrase boundaries using a variety of linguistic and acoustic cues. We propose a method which does not require model training and utilizes two prosodic cues that are based on ASR output. Boundaries are identified using discontinuities in speech rate (pre-boundary lengthening and phrase-initial acceleration) and silent pauses. The resulting phrases preserve syntactic validity, exhibit pitch reset, and compare well with manual tagging of prosodic boundaries. Collectively, our findings support the notion of prosodic phrases that represent coherent patterns across textual and acoustic parameters.
Competing Interests: The authors have declared that no competing interests exist.
Database: MEDLINE
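
Illustrative sketch (not part of the record): the abstract describes placing boundaries at silent pauses and at speech-rate discontinuities, computed from ASR word timings and without model training. The Python below is one possible minimal reading of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the `Word` record, the function names, the use of seconds per character as a stand-in for speech rate, and the `min_pause` and `slowdown_ratio` thresholds, none of which come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One ASR token with its time alignment (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def per_char_duration(word: Word) -> float:
    """Crude speech-rate proxy: seconds per character (an assumption)."""
    return (word.end - word.start) / max(len(word.text), 1)


def detect_boundaries(words: List[Word],
                      min_pause: float = 0.2,
                      slowdown_ratio: float = 2.0) -> List[int]:
    """Return indices i such that a boundary is placed after words[i].

    A boundary is placed when either cue fires:
      * a silent pause of at least `min_pause` seconds follows the word, or
      * the word is much slower than the next one (a slow word followed by
        a fast one, standing in for pre-boundary lengthening followed by
        phrase-initial acceleration).
    Both thresholds are illustrative guesses, not published values.
    """
    boundaries = []
    for i in range(len(words) - 1):
        cur, nxt = words[i], words[i + 1]
        pause = nxt.start - cur.end
        slow_then_fast = (
            per_char_duration(cur) >= slowdown_ratio * per_char_duration(nxt)
        )
        if pause >= min_pause or slow_then_fast:
            boundaries.append(i)
    return boundaries


def split_phrases(words: List[Word], boundaries: List[int]) -> List[List[str]]:
    """Cut the word sequence into phrases at the detected boundaries."""
    phrases, start = [], 0
    for b in boundaries:
        phrases.append([w.text for w in words[start:b + 1]])
        start = b + 1
    if start < len(words):
        phrases.append([w.text for w in words[start:]])
    return phrases
```

Calling `detect_boundaries` on a list of time-aligned words and passing the result to `split_phrases` yields candidate phrases. A realistic version would more likely measure rate per syllable or per phoneme and tune both thresholds against manually tagged data.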