Zobrazeno 1 - 10
of 239
pro vyhledávání: '"Benetos, Emmanouil"'
Autor:
Huang, Jiawen, Benetos, Emmanouil
Multilingual automatic lyrics transcription (ALT) is a challenging task due to the limited availability of labelled data and the challenges introduced by singing, compared to multilingual automatic speech recognition. Although some multilingual singi
Externí odkaz:
http://arxiv.org/abs/2406.17618
Autor:
Zhang, Ge, Qu, Scott, Liu, Jiaheng, Zhang, Chenchen, Lin, Chenghua, Yu, Chou Leuang, Pan, Danny, Cheng, Esther, Liu, Jie, Lin, Qunshu, Yuan, Raven, Zheng, Tuney, Pang, Wei, Du, Xinrun, Liang, Yiming, Ma, Yinghao, Li, Yizhi, Ma, Ziyang, Lin, Bill, Benetos, Emmanouil, Yang, Huan, Zhou, Junting, Ma, Kaijing, Liu, Minghao, Niu, Morry, Wang, Noah, Que, Quehry, Liu, Ruibo, Liu, Sine, Guo, Shawn, Gao, Soren, Zhou, Wangchunshu, Zhang, Xinyue, Zhou, Yizhi, Wang, Yubo, Bai, Yuelin, Zhang, Yuhan, Zhang, Yuxiang, Wang, Zenith, Yang, Zhenzhu, Zhao, Zijian, Zhang, Jiajun, Ouyang, Wanli, Huang, Wenhao, Chen, Wenhu
Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind pro
Externí odkaz:
http://arxiv.org/abs/2405.19327
Autor:
Xompero, Alessio, Bontonou, Myriam, Arbona, Jean-Michel, Benetos, Emmanouil, Cavallaro, Andrea
Accurately predicting whether an image is private before sharing it online is difficult due to the vast variety of content and the subjective nature of privacy itself. In this paper, we evaluate privacy models that use objects extracted from an image
Externí odkaz:
http://arxiv.org/abs/2405.01646
Autor:
Deng, Qixin, Yang, Qikai, Yuan, Ruibin, Huang, Yipeng, Wang, Yi, Liu, Xubo, Tian, Zeyue, Pan, Jiahao, Zhang, Ge, Lin, Hanfeng, Li, Yizhi, Ma, Yinghao, Fu, Jie, Lin, Chenghua, Benetos, Emmanouil, Wang, Wenwu, Xia, Guangyu, Xue, Wei, Guo, Yike
Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM
Externí odkaz:
http://arxiv.org/abs/2404.18081
Autor:
Qu, Xingwei, Bai, Yuelin, Ma, Yinghao, Zhou, Ziya, Lo, Ka Man, Liu, Jiaheng, Yuan, Ruibin, Min, Lejun, Liu, Xueling, Zhang, Tianyu, Du, Xinrun, Guo, Shuyue, Liang, Yiming, Li, Yizhi, Wu, Shangda, Zhou, Junting, Zheng, Tianyu, Ma, Ziyang, Han, Fengze, Xue, Wei, Xia, Gus, Benetos, Emmanouil, Yue, Xiang, Lin, Chenghua, Tan, Xu, Huang, Stephen W., Chen, Wenhu, Fu, Jie, Zhang, Ge
In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Nota
Externí odkaz:
http://arxiv.org/abs/2404.06393
Detecting the presence of animal vocalisations in nature is essential to study animal populations and their behaviors. A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims t
Externí odkaz:
http://arxiv.org/abs/2403.18638
Multi-Source Diffusion Models (MSDM) allow for compositional musical generation tasks: generating a set of coherent sources, creating accompaniments, and performing source separation. Despite their versatility, they require estimating the joint distr
Externí odkaz:
http://arxiv.org/abs/2403.11706
Autor:
Liang, Jinhua, Zhang, Huan, Liu, Haohe, Cao, Yin, Kong, Qiuqiang, Liu, Xubo, Wang, Wenwu, Plumbley, Mark D., Phan, Huy, Benetos, Emmanouil
We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing. Specifically, WavCraft describes the content of raw audio materials in natural lang
Externí odkaz:
http://arxiv.org/abs/2403.09527
Autor:
Yuan, Ruibin, Lin, Hanfeng, Wang, Yi, Tian, Zeyue, Wu, Shangda, Shen, Tianhao, Zhang, Ge, Wu, Yuhang, Liu, Cong, Zhou, Ziya, Ma, Ziyang, Xue, Liumeng, Wang, Ziyu, Liu, Qin, Zheng, Tianyu, Li, Yizhi, Ma, Yinghao, Liang, Yiming, Chi, Xiaowei, Liu, Ruibo, Wang, Zili, Li, Pengfei, Wu, Jingcheng, Lin, Chenghua, Liu, Qifeng, Jiang, Tao, Huang, Wenhao, Chen, Wenhu, Benetos, Emmanouil, Fu, Jie, Xia, Gus, Dannenberg, Roger, Xue, Wei, Kang, Shiyin, Guo, Yike
While Large Language Models (LLMs) demonstrate impressive capabilities in text generation, we find that their ability has yet to be generalized to music, humanity's creative language. We introduce ChatMusician, an open-source LLM that integrates intr
Externí odkaz:
http://arxiv.org/abs/2402.16153
Algorithms for automatic piano transcription have improved dramatically in recent years due to new datasets and modeling techniques. Recent developments have focused primarily on adapting new neural network architectures, such as the Transformer and
Externí odkaz:
http://arxiv.org/abs/2402.01424