SRMR variants for improved blind room acoustics characterization

Autor: Senoussaoui, M., Santos, J. F., Falk, T. H.
Rok vydání: 2015
Předmět:
Druh dokumentu: Working Paper
Popis: Reverberation, especially in large rooms, severely degrades speech recognition performance and speech intelligibility. Since direct measurement of room characteristics is usually not possible, blind estimation of reverberation-related metrics such as the reverberation time (RT) and the direct-to-reverberant energy ratio (DRR) can be valuable information to speech recognition and enhancement algorithms operating in enclosed environments. The objective of this work is to evaluate the performance of five variants of blind RT and DRR estimators based on a modulation spectrum representation of reverberant speech with single- and multi-channel speech data. These models are all based on variants of the so-called Speech-to-Reverberation Modulation Energy Ratio (SRMR). We show that these measures outperform a state-of-the-art baseline based on maximum-likelihood estimation of sound decay rates in terms of root-mean square error (RMSE), as well as Pearson correlation. Compared to the baseline, the best proposed measure, called NSRMR_k , achieves a 23% relative improvement in terms of RMSE and allows for relative correlation improvements ranging from 13% to 47% for RT prediction.
Comment: In Proceedings of the ACE Chal- lenge Workshop - a satellite event of IEEE-WASPAA 2015 (arXiv:1510.00383)
Databáze: arXiv