Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models

Author:	Huajian Fang, Timo Gerkmann
Language:	English
Year of publication:	2022
Subject:	FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
Description:	Single-channel deep speech enhancement approaches often estimate a single multiplicative mask to extract clean speech without a measure of its accuracy. Instead, in this work, we propose to quantify the uncertainty associated with clean speech estimates in neural network-based speech enhancement. Predictive uncertainty is typically categorized into aleatoric uncertainty and epistemic uncertainty. The former accounts for the inherent uncertainty in the data and the latter corresponds to the model uncertainty. Aiming for robust clean speech estimation and efficient predictive uncertainty quantification, we propose to integrate statistical complex Gaussian mixture models (CGMMs) into a deep speech enhancement framework. More specifically, we model the dependency between input and output stochastically by means of a conditional probability density and train a neural network to map the noisy input to the full posterior distribution of clean speech, modeled as a mixture of multiple complex Gaussian components. Experimental results on different datasets show that the proposed algorithm effectively captures predictive uncertainty and that combining powerful statistical models and deep learning also delivers superior speech enhancement performance. © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7ad13857d1b4ce770dea3f058f7aeed0 http://arxiv.org/abs/2212.04831