Intelligibility prediction for speech mixed with white Gaussian noise at low signal-to-noise ratios

Authors: Simone Graetzer, Carl Hopkins
Publication year: 2021
Subject:
Source: The Journal of the Acoustical Society of America. 149(2)
ISSN: 1520-8524, 0001-4966
Description: The effect of additive white Gaussian noise and high-pass filtering on speech intelligibility at signal-to-noise ratios (SNRs) from −26 to 0 dB was evaluated using British English talkers and normal-hearing listeners. SNRs below −10 dB were considered because they are relevant to speech security applications. Eight objective metrics were assessed: short-time objective intelligibility (STOI), a proposed variant termed STOI+, extended short-time objective intelligibility (ESTOI), normalised covariance metric (NCM), normalised subband envelope correlation metric (NSEC), two metrics derived from the coherence speech intelligibility index (CSII), and a speech transmission index (STI) computed with an envelope-based regression method. For speech and noise mixtures associated with intelligibility scores ranging from 0% to 98%, STOI+ performed at least as well as the other metrics and, under some conditions, better than STOI, ESTOI, STI, NSEC, CSIIMid, and CSIIHigh. Both STOI+ and NCM were associated with relatively low prediction error and bias for intelligibility prediction at SNRs from −26 to 0 dB. STI performed least well in terms of correlation with intelligibility scores, prediction error, bias, and reliability. Logistic regression modeling demonstrated that high-pass filtering, which increases the proportion of high- to low-frequency energy, was detrimental to intelligibility for SNRs from −17 to −5 dB inclusive.
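As an illustration of the stimulus construction the description refers to, the following minimal sketch scales white Gaussian noise so that a speech-plus-noise mixture reaches a requested SNR. It is not the authors' code: the function names `mix_at_snr` and `measured_snr_db`, the fixed random seed, and the choice to define SNR as the ratio of mean signal power to mean noise power over the whole signal are all assumptions made for this example.

```python
import math
import random


def mix_at_snr(speech, snr_db, seed=0):
    """Add white Gaussian noise to `speech` at a target SNR in dB.

    Hypothetical helper: SNR is taken as 10*log10(P_speech / P_noise),
    with power averaged over the full signal (an assumption of this sketch).
    Returns the mixture and the scaled noise.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in speech]

    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)

    # Choose the gain g so that speech_power / (g**2 * noise_power)
    # equals 10 ** (snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measured_snr_db(signal, noise):
    """Return 10*log10 of the ratio of mean signal power to mean noise power."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(signal_power / noise_power)
```

Calling `mix_at_snr(samples, -26)` on a list of speech samples would give a mixture at the lowest SNR mentioned in the description. Objective metrics such as STOI are typically then computed by comparing the clean signal with the mixture.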
Database: OpenAIRE