Improved Signal-to-Noise Ratio Estimation for Speech Enhancement
Author: | C. Plapous, Pascal Scalart, Claude Marro |
---|---|
Contributors: | Orange Labs [Lannion], France Télécom; Reconfigurable and Retargetable Digital Devices (R2D2); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); INRIA Rennes; École Nationale Supérieure des Sciences Appliquées et de Technologie (ENSSAT); Université de Rennes 1 (UR1); Université de Rennes (UNIV-RENNES) |
Language: | English |
Year of publication: | 2006 |
Subject: | Reverberation; Acoustics and Ultrasonics; Computer science; Noise reduction; Speech recognition; 020206 networking & telecommunications; 02 engineering and technology; Speech processing; 01 natural sciences; Speech enhancement; Background noise; Noise; Signal-to-noise ratio; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Harmonic; Electrical and Electronic Engineering; 010301 acoustics; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing |
Source: | IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2006 |
ISSN: | 1558-7916 |
Description: | This paper addresses the problem of single-microphone speech enhancement in noisy environments. State-of-the-art short-time noise reduction techniques are most often expressed as a spectral gain that depends on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) approach drastically limits the level of musical noise, but the estimated a priori SNR is biased because it depends on the speech spectrum estimated in the previous frame. As a result, the gain function matches the previous frame rather than the current one, which degrades the noise reduction performance. The consequence of this bias is an annoying reverberation effect. We propose a method called the two-step noise reduction (TSNR) technique, which solves this problem while maintaining the benefits of the decision-directed approach. The estimate of the a priori SNR is refined in a second step to remove the bias of the DD approach, thus removing the reverberation effect. However, classic short-time noise reduction techniques, including TSNR, introduce harmonic distortion in the enhanced speech because the estimators are unreliable at small signal-to-noise ratios. This is mainly due to the difficulty of estimating the noise power spectral density (PSD) in single-microphone schemes. To overcome this problem, we propose a method called harmonic regeneration noise reduction (HRNR). A nonlinearity is used to efficiently regenerate the degraded harmonics of the distorted signal. The resulting artificial signal is used to refine the a priori SNR, from which a spectral gain able to preserve the speech harmonics is computed. We analyze these methods and report objective and formal subjective test results comparing the HRNR and TSNR techniques. HRNR brings a significant improvement over TSNR thanks to the preservation of harmonics. |
Database: | OpenAIRE |
External link: |
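
The abstract in this record describes the DD, TSNR and HRNR steps only in words, so the following is a minimal per-frame sketch of how they might fit together, not the authors' implementation. Several choices here are assumptions that the record does not state: the Wiener gain `snr / (1 + snr)`, the smoothing factor `beta = 0.98`, half-wave rectification as the nonlinearity, the per-bin mixing weight `rho`, and all function and parameter names. Inputs are plain lists of per-bin power values, and the noise PSD is assumed to be known and strictly positive.

```python
def wiener_gain(snr_prio):
    """Wiener spectral gain for one bin (assumed gain rule)."""
    return snr_prio / (1.0 + snr_prio)


def dd_snr(noisy_pow, noise_psd, prev_clean_pow, beta=0.98):
    """Decision-directed a priori SNR for one frame.

    Mixes the previous frame's clean-speech power estimate with the
    half-wave rectified a posteriori SNR of the current frame.
    """
    snr = []
    for x2, n, s2_prev in zip(noisy_pow, noise_psd, prev_clean_pow):
        snr_post = x2 / n
        snr.append(beta * s2_prev / n + (1.0 - beta) * max(snr_post - 1.0, 0.0))
    return snr


def tsnr_frame(noisy_pow, noise_psd, prev_clean_pow, beta=0.98):
    """Two-step estimate: apply the DD gain to the current frame, then
    recompute the a priori SNR from that first-pass speech estimate."""
    g_dd = [wiener_gain(s) for s in dd_snr(noisy_pow, noise_psd, prev_clean_pow, beta)]
    snr_tsnr = [(g * g * x2) / n for g, x2, n in zip(g_dd, noisy_pow, noise_psd)]
    g_tsnr = [wiener_gain(s) for s in snr_tsnr]
    clean_pow = [g * g * x2 for g, x2 in zip(g_tsnr, noisy_pow)]
    return g_dd, g_tsnr, clean_pow


def half_wave_rectify(samples):
    """One possible time-domain nonlinearity for regenerating harmonics."""
    return [max(s, 0.0) for s in samples]


def hrnr_gain(tsnr_pow, harmo_pow, noise_psd, rho):
    """Refined gain from a per-bin mix of the TSNR output power and the power
    of the harmonically regenerated signal (rho is a hypothetical weight)."""
    gains = []
    for s2, h2, n, r in zip(tsnr_pow, harmo_pow, noise_psd, rho):
        snr = (r * s2 + (1.0 - r) * h2) / n
        gains.append(wiener_gain(snr))
    return gains


# Speech onset in the last bin: the previous frame held no speech.
noisy_pow = [1.0, 1.0, 100.0]
noise_psd = [1.0, 1.0, 1.0]
prev_clean_pow = [0.0, 0.0, 0.0]
g_dd, g_tsnr, clean_pow = tsnr_frame(noisy_pow, noise_psd, prev_clean_pow)
```

In this toy frame the DD gain in the onset bin is held down by the empty previous frame, while the second step raises it toward the value the current frame suggests, which is the bias the abstract attributes to the DD approach. To use `hrnr_gain`, one would resynthesize the TSNR output, pass it through `half_wave_rectify`, and take the power spectrum of the result as `harmo_pow`; that transform step is left out of the sketch.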