Does Inharmonicity Improve an NMF-Based Piano Transcription Model?
Author: | Antoine Falaize, Francois Rigaud, Laurent Daudet, Bertrand David |
---|---|
Contributors: | Télécom Paristech, Admin, Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Département Traitement du Signal et des Images (TSI), Télécom ParisTech-Centre National de la Recherche Scientifique (CNRS), Institut de Recherche et Coordination Acoustique/Musique (IRCAM), Ecole Superieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), Université Paris sciences et lettres (PSL) |
Language: | English |
Publication year: | 2013 |
Subject: |
[SPI.ACOU] Engineering Sciences [physics]/Acoustics [physics.class-ph]; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Computer science; Transcription (music); Speech recognition; Piano; Harmonic (mathematics); Music transcription; Non-negative matrix factorization; Sound recording and reproduction; Inharmonicity; Feature (machine learning); 020206 networking & telecommunications; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 030507 speech-language pathology & audiology; 03 medical and health sciences; 0305 other medical science |
Source: | 38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), May 2013, Vancouver, Canada, pp. 11-15 |
Description: | International audience; This paper investigates how precise a model should be for a robust model-based NMF analysis of piano recordings. While inharmonicity is an essential feature of piano tones from a perceptual point of view, its explicit inclusion in sound models is not straightforward and may even damage the quality of the analysis. Here, we assess the quality of the analysis with a transcription task, and compare three different models for the spectra of the dictionary: one strictly harmonic, one following the theoretical inharmonicity law, and one with relaxed inharmonicity constraints. Experimental results show that both inharmonic models can indeed significantly enhance the results, but only when a good initialization is provided. |
Database: | OpenAIRE |