Multichannel nonnegative tensor factorization with structured constraints for user-guided audio source separation
Authors: | Alexey Ozerov, Cédric Févotte, Raphaël Blouet, Jean-Louis Durrieu |
---|---|
Contributors: | Speech and sound data modeling and processing (METISS); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes 1 (UR1); Université de Rennes (UNIV-RENNES); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Inria Rennes – Bretagne Atlantique; Laboratoire Traitement et Communication de l'Information (LTCI); Télécom ParisTech; Institut Mines-Télécom [Paris] (IMT); Yacast; Laboratoire de Traitement du signal [EPFL] / Signal Processing Laboratories (SP Lab); Ecole Polytechnique Fédérale de Lausanne (EPFL); Ozerov, Alexey; Quaero program funded by OSEO, the French State agency for innovation; ANR-06-RIAM-0024, SARAH: StAndardisation du Remastering Audio Haute-définition (2006), Programme Audiovisuel et Multimédia (RIAM); ANR-09-JCJC-0073, TANGERINE (2009), Jeunes chercheuses et jeunes chercheurs (JCJC) |
Language: | English |
Publication year: | 2011 |
Subject: |
Computer science; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Speech recognition; Convolution; Source separation; Tensor; Audio signal processing; SIGNAL (programming language); Pattern recognition; Time–frequency analysis; Spectrogram; Noise (video); Artificial intelligence; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020206 networking & telecommunications; 03 medical and health sciences; 0305 other medical science; 030507 speech-language pathology & audiology |
Source: | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'11), May 2011, Prague, Czech Republic |
Description: | International audience; Separating multiple tracks from professionally produced music recordings (PPMRs) is still a challenging problem. We address this task with a user-guided approach in which the separation system is provided with segmental information indicating the time activations of the particular instruments to be separated. This information may typically be retrieved from manual annotation. We use a so-called multichannel nonnegative tensor factorization (NTF) model, in which the original sources are observed through a multichannel convolutive mixture and in which the source power spectrograms are jointly modeled by a 3-valence (time/frequency/source) tensor. Our user-guided separation method produced competitive results at the 2010 Signal Separation Evaluation Campaign, with sufficient quality for real-world music editing applications. |
Database: | OpenAIRE |
External link: |
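
The 3-valence (time/frequency/source) tensor and the user-supplied time activations mentioned in the description can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single-channel mixture instead of the multichannel convolutive model, a fixed binary assignment of components to sources, and Kullback-Leibler multiplicative updates. The names `source_spectrograms`, `fit_guided_ntf`, `active`, and `K_per_source` are hypothetical and do not come from the record.

```python
import numpy as np


def source_spectrograms(W, H, Q):
    """Build the (frequency, time, source) tensor
    P[f, n, j] = sum_k W[f, k] * H[n, k] * Q[j, k]."""
    return np.einsum("fk,nk,jk->fnj", W, H, Q)


def fit_guided_ntf(V, active, K_per_source=2, n_iter=100, seed=0):
    """Fit a simplified, single-channel NTF to a mixture power spectrogram.

    V      : (F, N) nonnegative mixture power spectrogram.
    active : (J, N) boolean annotation; active[j, n] is True when
             source j is marked as playing in frame n.
    Assumption: each component k belongs to exactly one source, so Q is a
    fixed 0/1 matrix and the mixture model reduces to V ~ W @ H.T.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    J = active.shape[0]
    K = J * K_per_source

    # Q[j, k] = 1 when component k is assigned to source j.
    Q = np.repeat(np.eye(J), K_per_source, axis=1)

    # mask[n, k] = 1 when the source owning component k is active in frame n.
    mask = active.T.astype(float) @ Q

    W = rng.random((F, K)) + 0.1
    H = (rng.random((N, K)) + 0.1) * mask  # silent frames start at exactly 0

    eps = 1e-12
    for _ in range(n_iter):
        # KL-divergence multiplicative updates (assumed here, not from the record).
        V_hat = W @ H.T + eps
        W *= ((V / V_hat) @ H) / (H.sum(axis=0) + eps)
        V_hat = W @ H.T + eps
        H *= ((V / V_hat).T @ W) / (W.sum(axis=0) + eps)
    return W, H, Q
```

Because the updates are multiplicative, entries of `H` that start at zero stay at zero, so each source's estimated spectrogram is empty in the frames the annotation marks as silent.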