MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning

Author:	Li, Yizhi; Yuan, Ruibin; Zhang, Ge; Ma, Yinghao; Lin, Chenghua; Chen, Xingran; Ragni, Anton; Yin, Hanzhi; Hu, Zhijie; He, Haoyu; Benetos, Emmanouil; Gyenge, Norbert; Liu, Ruibo; Fu, Jie
Year of publication:	2022
Subject:	Computer Science - Sound; Computer Science - Artificial Intelligence; Computer Science - Machine Learning; Computer Science - Multimedia; Electrical Engineering and Systems Science - Audio and Speech Processing
Document type:	Working Paper
Description:	The deep learning community has witnessed exponentially growing interest in self-supervised learning (SSL). However, how to build a framework for learning useful representations of raw music waveforms in a self-supervised manner remains unexplored. In this work, we design Music2Vec, a framework exploring different SSL algorithmic components and tricks for music audio recordings. Our model achieves results comparable to the state-of-the-art (SOTA) music SSL model Jukebox, despite being significantly smaller, with fewer than 2% of the latter's parameters. The model will be released on Hugging Face (see https://huggingface.co/m-a-p/music2vec-v1).
Database:	arXiv
External link:	http://arxiv.org/abs/2212.02508