When noise vocoding can improve the intelligibility of sub-critical band speech.

Authors: Bashford Jr., James A.; Warren, Richard M.; Lenz, Peter W.
Source: Proceedings of Meetings on Acoustics; 2010, Vol. 9 Issue 1, p060001, 9p
Abstract: This study examined the redundancy of spectral and temporal information in everyday sentences, which were reduced to 16 rectangular spectral bands having center frequencies ranging from 250 to 8000 Hz, spaced at 1/3-octave intervals. High-order filtering eliminated contributions from transition bands, and the widths of the resulting effectively rectangular speech bands were varied from 4% down to 0.5%. Intelligibility of these sub-critical bandwidth stimuli ranged from nearly perfect in the 4% bandwidth conditions, down to nearly zero in the 0.5% bandwidth conditions. However, a large intelligibility increase was obtained under the narrower filtering conditions when the speech bands were used to vocode broader noise bands that approximated critical bandwidths (ERBn) at the 16 center frequencies. For example, the 0.5%- and 1%-bandwidth speech stimuli were only about 1% and 20% intelligible, respectively, whereas scores of about 26% and 60%, respectively, were obtained for the ERBn-wide noise bands modulated by the speech bands. These large intelligibility increases occurred despite elimination of spectral fine structure and the addition of stochastic fluctuations to the speech-envelope cues. Results from additional experiments indicate that optimal temporal processing requires that envelope cues stimulate a majority of the fibers comprising an ERBn. [Work supported by NIH.] [ABSTRACT FROM AUTHOR]
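The band layout described in the abstract can be illustrated with a short calculation. The sketch below is not the authors' procedure: it assumes the stated bandwidths (4% down to 0.5%) are percentages of each band's center frequency, and it assumes ERBn follows the widely used Glasberg and Moore (1990) approximation, ERBn = 24.7(4.37f/1000 + 1) Hz. The function names are hypothetical.

```python
# Illustrative sketch only; names and the ERBn formula are assumptions,
# not taken from the abstract.

def center_frequencies(f_low=250.0, n_bands=16):
    """Center frequencies spaced at 1/3-octave intervals starting at f_low (Hz)."""
    return [f_low * 2 ** (k / 3) for k in range(n_bands)]


def erbn_hz(fc_hz):
    """ERBn in Hz, assuming the Glasberg and Moore (1990) approximation."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def band_table(speech_bw_fraction):
    """Rows of (center frequency, speech bandwidth, ERBn), all in Hz.

    speech_bw_fraction is assumed to be a fraction of the center
    frequency, e.g. 0.04 for the 4% condition.
    """
    return [(fc, speech_bw_fraction * fc, erbn_hz(fc))
            for fc in center_frequencies()]


if __name__ == "__main__":
    for fc, bw, erb in band_table(0.04):
        print(f"{fc:8.1f} Hz   speech band {bw:6.1f} Hz   ERBn {erb:6.1f} Hz")
```

Under these assumptions, sixteen 1/3-octave steps starting at 250 Hz end at 8000 Hz, and even the widest (4%) speech band is narrower than ERBn at every center frequency, which is consistent with the abstract's description of the stimuli as sub-critical.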
Database: Complementary Index