PSFM—A Probabilistic Source Filter Model for Noise Robust Glottal Closure Instant Detection

Author: Mv, Achuth Rao; Ghosh, Prasanta Kumar
Source: IEEE-ACM Transactions on Audio, Speech, and Language Processing; September 2018, Vol. 26 Issue: 9 p1645-1657, 13p
Abstract: Accurate estimation of glottal closure instants (GCIs) enables several pitch-synchronous speech analyses, such as prosody modification, glottal inverse filtering, and the study of pathological speech. We propose a probabilistic source-filter model (PSFM) for voiced speech, in which the source is modeled using a Bernoulli-Gaussian distribution that captures the GCI locations, and the all-pole filter coefficients are modeled using a Gaussian distribution. The probability of a GCI at each speech sample is estimated using Gibbs sampling. We propose a cost to estimate the exact GCI locations using N-best dynamic programming. A key feature of the proposed PSFM is that it allows the second-order statistics of the noise to be included when estimating the GCI locations, resulting in a noise-robust GCI detection technique, although one with high computational complexity. Evaluation on the archivable priority list actual-word database (APLAWD) shows that the proposed algorithm performs on par with the state-of-the-art GCI detection method on clean speech. However, when evaluated in noisy conditions using five types of noise at six different signal-to-noise ratio (SNR) levels, the proposed method performs better than the best of the existing GCI detection schemes, particularly at low SNR, indicating the noise robustness of the proposed method.
Database: Supplemental Index
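
Note: The generative side of the model described in the abstract (a Bernoulli-Gaussian source driving an all-pole filter, observed in additive noise) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name, the parameter values, and the fixed filter coefficients are assumptions made for illustration only; the abstract places a Gaussian distribution on the filter coefficients, whereas this sketch holds them fixed for simplicity, and it does not cover the inference steps (Gibbs sampling, N-best dynamic programming).

```python
import numpy as np


def sample_bernoulli_gaussian_speech(n_samples=400, gci_prob=0.02,
                                     a=(1.0, -1.6, 0.8), source_std=1.0,
                                     noise_std=0.05, seed=0):
    """Draw one synthetic signal from a toy source-filter model.

    All names and default values here are hypothetical, chosen only to
    illustrate the structure named in the abstract.
    """
    rng = np.random.default_rng(seed)

    # Bernoulli part: q[n] is True where an excitation impulse (a GCI) occurs.
    q = rng.random(n_samples) < gci_prob

    # Gaussian part: impulse amplitude is Gaussian where q[n] is True, else 0.
    e = np.where(q, rng.normal(0.0, source_std, n_samples), 0.0)

    # All-pole filter 1 / A(z), with A(z) = a[0] + a[1] z^-1 + a[2] z^-2 + ...
    # and a[0] = 1, so s[n] = e[n] - sum_{k>=1} a[k] * s[n-k].
    s = np.zeros(n_samples)
    for n in range(n_samples):
        acc = e[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * s[n - k]
        s[n] = acc

    # Observation: filtered signal plus additive white Gaussian noise
    # (white noise is an assumption of this sketch).
    y = s + rng.normal(0.0, noise_std, n_samples)
    return q, e, s, y


if __name__ == "__main__":
    q, e, s, y = sample_bernoulli_gaussian_speech()
    print("number of simulated GCIs:", int(q.sum()))
    print("first simulated GCI indices:", np.flatnonzero(q)[:5])
```

The default coefficients give poles at 0.8 ± 0.4j, which lie inside the unit circle, so the toy filter is stable. A GCI detector in this setting would aim to recover the indicator sequence q from the noisy observation y.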