Examining the Perceptual Effect of Alternative Objective Functions for Deep Learning Based Music Source Separation
Author: | Derry Fitzgerald, Gerald Schuller, Stylianos Ioannis Mimilakis, Konstantinos Drossos, Estefanía Cano |
---|---|
Year of publication: | 2018 |
Subject: | Mean squared error; Artificial neural network; Linear programming; Computer science; business.industry; Speech recognition; Deep learning; 020206 networking & telecommunications; 02 engineering and technology; Regularization (mathematics); 030507 speech-language pathology & audiology; 03 medical and health sciences; 0202 electrical engineering, electronic engineering, information engineering; Source separation; Artificial intelligence; Singing; 0305 other medical science; business |
Source: | ACSSC |
DOI: | 10.1109/acssc.2018.8645257 |
Description: | In this study, we examine the effect of various objective functions used to optimize MaD (Masker and Denoiser), a recently proposed deep learning architecture for singing voice separation. The parameters of the MaD architecture are optimized using an objective function that contains a reconstruction criterion between the predicted and true magnitude spectra of the singing voice, and a regularization term. We examine various reconstruction criteria, such as the generalized Kullback-Leibler divergence, the mean squared error, and the noise-to-mask ratio. We also explore regularization terms recently proposed for optimizing MaD, such as sparsity and TwinNetwork regularization. Results from both objective assessment and listening tests suggest that TwinNetwork regularization improves singing voice separation quality. |
Database: | OpenAIRE |
External link: |
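
The description above refers to an objective made of a reconstruction criterion between predicted and true magnitude spectra plus a regularization term. The following is a minimal NumPy sketch of that general shape, not the authors' implementation. It covers only two of the named criteria (generalized Kullback-Leibler and mean squared error) and uses a plain L1 penalty as a stand-in for the sparsity term. The function names, the `mask` argument, the `weight` value, and the `EPS` constant are all assumptions made for illustration; the noise-to-mask ratio and TwinNetwork regularization are not shown.

```python
import numpy as np

EPS = 1e-8  # assumed small constant to avoid log(0) and division by zero


def generalized_kl(true_mag, pred_mag):
    """Generalized Kullback-Leibler divergence, summed over all bins."""
    t = true_mag + EPS
    p = pred_mag + EPS
    return float(np.sum(t * np.log(t / p) - t + p))


def mean_squared_error(true_mag, pred_mag):
    """Mean squared error, averaged over all bins."""
    return float(np.mean((true_mag - pred_mag) ** 2))


def objective(true_mag, pred_mag, mask, criterion=generalized_kl, weight=1e-2):
    """Reconstruction criterion plus a weighted L1 (sparsity) penalty.

    `mask` is a hypothetical array that the penalty is applied to;
    which quantity is penalized in MaD is not stated in the record.
    """
    sparsity = float(np.sum(np.abs(mask)))
    return criterion(true_mag, pred_mag) + weight * sparsity


# Usage with random non-negative arrays standing in for magnitude spectra.
rng = np.random.default_rng(0)
true_mag = rng.random((4, 5))
pred_mag = rng.random((4, 5))
mask = rng.random((4, 5))

loss_kl = objective(true_mag, pred_mag, mask)
loss_mse = objective(true_mag, pred_mag, mask, criterion=mean_squared_error)
```

Passing the criterion as a function argument keeps the regularization term fixed while the reconstruction criterion is swapped, which mirrors the kind of comparison the description outlines.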