Discriminative and Dynamic Nonnegative Matrix Factorization on Monaural Audio Source Separation
Author: | Huang, Yi-Chun, 黃奕鈞 |
---|---|
Year of publication: | 2016 |
Document type: | Thesis |
Description: | 105 Nonnegative matrix factorization (NMF), which learns dictionaries from source spectra and uses them to decompose the mixture in the test phase, is a widely used tool for audio source separation. However, standard NMF does not consider the temporal properties of the signals when learning dictionaries. Standard NMF is also a generative model, which does not guarantee that a good representation model is also a good separation model. In addition, the learned dictionaries should be partitioned into subgroups to account for sources with different spectro-temporal properties, such as speech signals from different speakers or music signals from different instruments. We therefore propose a method that combines extensions of NMF to address these problems for speech denoising and singing voice separation. For temporal modeling, our method adopts a post-filtering technique that derives a source-specific vector autoregressive (VAR) model to smooth the NMF coefficients in the test phase. For partitioning, we use the mixture of local dictionaries (MLD) technique to divide the dictionaries into subgroups by considering intra- and inter-group distances. We also introduce a modified discriminative learning procedure to address the representation-separation problem. In sum, our extended NMF method additionally accounts for the temporal properties of each subgroup and for discrimination between sources. |
Database: | Networked Digital Library of Theses & Dissertations |
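
The baseline that the abstract builds on, standard supervised NMF separation, can be sketched as follows. This is a minimal illustration and not the thesis's method: it assumes a Euclidean cost with multiplicative updates and a soft-mask reconstruction, and it omits the VAR post-filter, the MLD partitioning, and the discriminative training. The function names `nmf` and `separate` and all parameter values are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, update_W=True, seed=0, eps=1e-9):
    """Approximate V (freq x frames, nonnegative) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the abstract does not state which cost function is used).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    if W is None:
        W = rng.random((n_freq, rank)) + eps
    H = rng.random((W.shape[1], n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W_a, W_b, n_iter=200, eps=1e-9):
    """Decompose a mixture with fixed, pre-trained dictionaries.

    Only the activations H are estimated in the test phase; each source
    is then recovered with a soft (Wiener-like) mask.
    """
    W = np.hstack([W_a, W_b])
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W, update_W=False)
    k = W_a.shape[1]
    est_a = W_a @ H[:k]   # source A's share of the model
    est_b = W_b @ H[k:]   # source B's share of the model
    mask = est_a / (est_a + est_b + eps)
    return mask * V_mix, (1.0 - mask) * V_mix

# Toy usage with random "spectrograms" standing in for real training data.
rng = np.random.default_rng(1)
V_a = rng.random((16, 40))
V_b = rng.random((16, 40))
W_a, _ = nmf(V_a, rank=4)   # training phase: learn one dictionary per source
W_b, _ = nmf(V_b, rank=4)
S_a, S_b = separate(V_a[:, :10] + V_b[:, :10], W_a, W_b)
```

In this sketch each frame's activations are estimated independently and each dictionary is trained only to represent its own source, which are the two limitations the abstract attributes to standard NMF.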