Intelligibility prediction for speech mixed with white Gaussian noise at low signal-to-noise ratios
Authors: | Simone Graetzer, Carl Hopkins |
---|---|
Year of publication: | 2021 |
Subject: |
Acoustics and Ultrasonics; Speech recognition; Speech Intelligibility; Reproducibility of Results; Covariance; Intelligibility (communication); Signal-To-Noise Ratio; Correlation; Noise; Additive white Gaussian noise; Arts and Humanities (miscellaneous); Speech Perception; Coherence (signal processing); Perceptual Masking; Energy (signal processing); Speech transmission index; Mathematics |
Source: | The Journal of the Acoustical Society of America. 149(2) |
ISSN: | 1520-8524 0001-4966 |
Description: | The effect of additive white Gaussian noise and high-pass filtering on speech intelligibility at signal-to-noise ratios (SNRs) from −26 to 0 dB was evaluated using British English talkers and normal-hearing listeners. SNRs below −10 dB were considered because they are relevant to speech security applications. Eight objective metrics were assessed: short-time objective intelligibility (STOI), a proposed variant termed STOI+, extended short-time objective intelligibility (ESTOI), normalised covariance metric (NCM), normalised subband envelope correlation metric (NSEC), two metrics derived from the coherence speech intelligibility index (CSII), and an envelope-based regression method for the speech transmission index (STI). For speech and noise mixtures associated with intelligibility scores ranging from 0% to 98%, STOI+ performed at least as well as the other metrics and, under some conditions, better than STOI, ESTOI, STI, NSEC, CSIIMid, and CSIIHigh. Both STOI+ and NCM were associated with relatively low prediction error and bias for intelligibility prediction at SNRs from −26 to 0 dB. STI performed least well in terms of correlation with intelligibility scores, prediction error, bias, and reliability. Logistic regression modeling demonstrated that high-pass filtering, which increases the proportion of high- to low-frequency energy, was detrimental to intelligibility for SNRs between −5 and −17 dB inclusive. |
Database: | OpenAIRE |