Improving Tokenization Expressiveness With Pitch Intervals

Authors: Kermarec, Mathieu; Bigo, Louis; Keller, Mikaela
Contributors: Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS); Machine Learning in Information Networks (MAGNET), Inria Lille - Nord Europe, Institut National de Recherche en Informatique et en Automatique (Inria); Algomus; Modélisation, Information et Systèmes - UR UPJV 4290 (MIS), Université de Picardie Jules Verne (UPJV); Bigo, Louis
Acknowledgments: The authors are grateful to the Algomus and Magnet teams for fruitful discussions. This work is supported by a special interdisciplinary funding (AIT) from the CRIStAL laboratory and the Merlion PHC Music Language Processing N°48304SM funded by Campus France.
Language: English
Publication year: 2022
Subject:
Source: 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Late-Breaking Demo Session, Dec 2022, Bengaluru, India.
Description: Training sequence models such as transformers on symbolic music requires representing music as sequences of atomic elements called tokens. State-of-the-art music tokenizations encode pitch values explicitly, which makes it harder for a machine learning model to generalize musical knowledge across different keys. We propose directions for a tokenization that encodes pitch intervals rather than pitch values, resulting in transposition-invariant representations. The musical expressiveness of this new tokenization is evaluated through two MIR classification tasks: composer classification and end-of-phrase detection. We publicly release the code produced in this research.
Database: OpenAIRE
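
Illustration: the description contrasts tokens that encode pitch values with tokens that encode pitch intervals. The following is a minimal sketch of that idea, not the authors' released code. It assumes a monophonic melody given as MIDI note numbers, and the token names (`Pitch_…`, `Interval_…`) and function names are hypothetical, chosen only for this example. It shows that interval tokens stay identical when the melody is transposed, while pitch-value tokens do not.

```python
from typing import List


def to_pitch_tokens(pitches: List[int]) -> List[str]:
    """Encode each note by its absolute MIDI pitch (hypothetical token names)."""
    return [f"Pitch_{p}" for p in pitches]


def to_interval_tokens(pitches: List[int]) -> List[str]:
    """Encode each note after the first by its signed distance in semitones
    from the previous note (hypothetical token names)."""
    return [f"Interval_{b - a:+d}" for a, b in zip(pitches, pitches[1:])]


def transpose(pitches: List[int], semitones: int) -> List[int]:
    """Shift every pitch by the same number of semitones."""
    return [p + semitones for p in pitches]


melody = [60, 62, 64, 60]          # C4, D4, E4, C4
shifted = transpose(melody, 5)     # same melody, a perfect fourth higher

pitch_tokens = to_pitch_tokens(melody)
shifted_pitch_tokens = to_pitch_tokens(shifted)
interval_tokens = to_interval_tokens(melody)
shifted_interval_tokens = to_interval_tokens(shifted)

print(pitch_tokens == shifted_pitch_tokens)        # False: values depend on the key
print(interval_tokens == shifted_interval_tokens)  # True: intervals do not
```

Note that a purely interval-based sequence drops the absolute starting pitch; whether and how to keep that information is a design choice this sketch does not address.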