Complex ISNMF: a Phase-Aware Model for Monaural Audio Source Separation

Authors: Paul Magron, Tuomas Virtanen
Contributors: Magron, Paul
Language: English
Year of publication: 2019
Subject:
FOS: Computer and information sciences
Sound (cs.SD)
Acoustics and Ultrasonics
Computer science
Gaussian
complex NMF
Bayesian inference
Computer Science - Sound
Non-negative matrix factorization
Itakura-Saito divergence
Audio and Speech Processing (eess.AS)
Distortion
FOS: Electrical engineering, electronic engineering, information engineering
Computer Science (miscellaneous)
Source separation
Electrical and Electronic Engineering
[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing
Markov chain
phase recovery
audio source separation
Statistical model
Computational Mathematics
anisotropic Gaussian model
Fourier transform
Algorithm
Random variable
Nonnegative matrix factorization (NMF)
Electrical Engineering and Systems Science - Audio and Speech Processing
Description: This paper introduces a phase-aware probabilistic model for audio source separation. Classical source models in the short-time Fourier transform domain use circularly-symmetric Gaussian or Poisson random variables. This is equivalent to assuming that the phase of each source is uniformly distributed, which is not suitable for exploiting the underlying structure of the phase. Drawing on preliminary works, we introduce here a Bayesian anisotropic Gaussian source model in which the phase is no longer uniform. Such a model permits us to favor a phase value that originates from a signal model through a Markov chain prior structure. The variances of the latent variables are structured with nonnegative matrix factorization (NMF). The resulting model is called complex Itakura-Saito NMF (ISNMF) since it generalizes the ISNMF model to the case of non-isotropic variables. It combines the advantages of ISNMF, which uses a distortion measure adapted to audio and yields a set of estimates that preserve the overall energy of the mixture, and of complex NMF, which enables one to account for some phase constraints. We derive a generalized expectation-maximization algorithm to estimate the model parameters. Experiments conducted on a musical source separation task in a semi-informed setting show that the proposed approach outperforms state-of-the-art phase-aware separation techniques.
Database: OpenAIRE