Does Inharmonicity Improve an NMF-Based Piano Transcription Model?

Author:	Antoine Falaize, Francois Rigaud, Laurent Daudet, Bertrand David
Contributors:	Télécom Paristech, Admin, Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Département Traitement du Signal et des Images (TSI), Télécom ParisTech-Centre National de la Recherche Scientifique (CNRS), Institut de Recherche et Coordination Acoustique/Musique (IRCAM), Ecole Superieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), Université Paris sciences et lettres (PSL)
Language:	English
Year of publication:	2013
Subject:	[SPI.ACOU] Engineering Sciences [physics]/Acoustics [physics.class-ph]; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Computer science; Transcription (music); Speech recognition; Piano; Harmonic (mathematics); Music transcription; Non-negative matrix factorization; Sound recording and reproduction; Inharmonicity; Feature (machine learning); 02 engineering and technology; 0202 electrical engineering electronic engineering information engineering; 020206 networking & telecommunications; 03 medical and health sciences; 0305 other medical science; 030507 speech-language pathology & audiology
Source:	38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), May 2013, Vancouver, Canada. pp. 11-15
Description:	International audience. This paper investigates how precise a model should be for a robust model-based NMF analysis of piano recordings. While inharmonicity is an essential feature of piano tones from a perceptual point of view, its explicit inclusion in sound models is not straightforward and may even damage the quality of the analysis. Here, we assess the quality of the analysis with a transcription task and compare three different models for the spectra of the dictionary: one strictly harmonic, one following the theoretical inharmonicity law, and one with relaxed inharmonicity constraints. Experimental results show that both inharmonic models can indeed significantly enhance the results, but only when a good initialization is provided.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::296bfc4bd17a39e83f12869e3c929dda https://hal-imt.archives-ouvertes.fr/hal-00856734
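
Note on the description: the record does not state the "theoretical inharmonicity law" it refers to. A commonly used form for stiff piano strings is f_n = n * f0 * sqrt(1 + B * n^2), where B is the inharmonicity coefficient and B = 0 gives the strictly harmonic case. The sketch below only illustrates that formula; the function name, the 440 Hz fundamental, and the value B = 1e-3 are illustrative assumptions, not values taken from the paper.

```python
import math


def partial_frequencies(f0, n_partials, B=0.0):
    """Partial frequencies under the stiff-string law f_n = n*f0*sqrt(1 + B*n^2).

    f0: fundamental frequency in Hz (illustrative input).
    B:  inharmonicity coefficient; B = 0.0 yields a strictly harmonic series.
    """
    return [n * f0 * math.sqrt(1.0 + B * n * n) for n in range(1, n_partials + 1)]


# Strictly harmonic spectrum versus an inharmonic one (B chosen for illustration).
harmonic = partial_frequencies(440.0, 5)
inharmonic = partial_frequencies(440.0, 5, B=1e-3)

for n, (h, i) in enumerate(zip(harmonic, inharmonic), start=1):
    print(f"partial {n}: harmonic {h:.1f} Hz, inharmonic {i:.1f} Hz")
```

With B > 0 every partial is shifted upward, and the shift grows with the partial rank, which is why a strictly harmonic dictionary can mismatch the upper partials of piano tones.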