Showing 1 - 10 of 4,250 results for search: '"Benetos A."'
Authors:
Li, Yizhi, Zhang, Ge, Ma, Yinghao, Yuan, Ruibin, Zhu, Kang, Guo, Hangyu, Liang, Yiming, Liu, Jiaheng, Wang, Zekun, Yang, Jian, Wu, Siwei, Qu, Xingwei, Shi, Jinjie, Zhang, Xinyue, Yang, Zhenzhu, Wang, Xiangzhou, Zhang, Zhaoxiang, Liu, Zachary, Benetos, Emmanouil, Huang, Wenhao, Lin, Chenghua
Recent advancements in multimodal large language models (MLLMs) have aimed to integrate and interpret data across diverse modalities. However, the capacity of these models to concurrently process and reason about multiple modalities remains inadequat
External link:
http://arxiv.org/abs/2409.15272
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification, where a model must generalize to new classes based on only a few available examples. Extending Prototypical Networks, L
External link:
http://arxiv.org/abs/2409.11264
Acoustic identification of individual animals (AIID) is closely related to audio-based species classification but requires a finer level of detail to distinguish between individual animals within the same species. In this work, we frame AIID as a hie
External link:
http://arxiv.org/abs/2409.08673
Passive acoustic monitoring (PAM) is crucial for bioacoustic research, enabling non-invasive species tracking and biodiversity monitoring. Citizen science platforms like Xeno-Canto provide large annotated datasets from focal recordings, where the tar
External link:
http://arxiv.org/abs/2409.08589
Authors:
Ma, Yinghao, Øland, Anders, Ragni, Anton, Del Sette, Bleiz MacSen, Saitis, Charalampos, Donahue, Chris, Lin, Chenghua, Plachouras, Christos, Benetos, Emmanouil, Shatri, Elona, Morreale, Fabio, Zhang, Ge, Fazekas, György, Xia, Gus, Zhang, Huan, Manco, Ilaria, Huang, Jiawen, Guinot, Julien, Lin, Liwei, Marinelli, Luca, Lam, Max W. Y., Sharma, Megha, Kong, Qiuqiang, Dannenberg, Roger B., Yuan, Ruibin, Wu, Shangda, Wu, Shih-Lun, Dai, Shuqi, Lei, Shun, Kang, Shiyin, Dixon, Simon, Chen, Wenhu, Huang, Wenhao, Du, Xingjian, Qu, Xingwei, Tan, Xu, Li, Yizhi, Tian, Zeyue, Wu, Zhiyong, Wu, Zhizheng, Ma, Ziyang, Wang, Ziyu
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models
External link:
http://arxiv.org/abs/2408.14340
Authors:
Weck, Benno, Manco, Ilaria, Benetos, Emmanouil, Quinton, Elio, Fazekas, George, Bogdanov, Dmitry
Multimodal models that jointly process audio and language hold great promise in audio understanding and are increasingly being adopted in the music domain. By allowing users to query via text and obtain information about a given audio input, these mo
External link:
http://arxiv.org/abs/2408.01337
Authors:
Zhou, Ziya, Wu, Yuhang, Wu, Zhiyue, Zhang, Xinyue, Yuan, Ruibin, Ma, Yinghao, Wang, Lu, Benetos, Emmanouil, Xue, Wei, Guo, Yike
Symbolic Music, akin to language, can be encoded in discrete symbols. Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain including understanding and generation. Yet scant
External link:
http://arxiv.org/abs/2407.21531
Authors:
Benetos, Athanase, Fritsch, Coralie, Horton, Emma, Lenotre, Lionel, Toupance, Simon, Villemonais, Denis
Telomeres are repetitive sequences of nucleotides at the end of chromosomes, whose evolution over time is intrinsically related to biological ageing. In most cells, with each cell division, telomeres shorten due to the so-called end replication probl
External link:
http://arxiv.org/abs/2407.11453
Multi-instrument music transcription aims to convert polyphonic music recordings into musical scores assigned to each instrument. This task is challenging for modeling as it requires simultaneously identifying multiple instruments and transcribing th
External link:
http://arxiv.org/abs/2407.04822
Authors:
Huang, Jiawen, Benetos, Emmanouil
Multilingual automatic lyrics transcription (ALT) is a challenging task due to the limited availability of labelled data and the challenges introduced by singing, compared to multilingual automatic speech recognition. Although some multilingual singi
External link:
http://arxiv.org/abs/2406.17618