Showing 1 - 10 of 3,931 results for search: '"Fazekas, P."'
In our demo, participants are invited to explore the Diff-MSTC prototype, which integrates the Diff-MST model into Steinberg's digital audio workstation (DAW), Cubase. Diff-MST, a deep learning model for mixing style transfer, forecasts mixing consol
External link:
http://arxiv.org/abs/2411.06576
In this paper, we study a series of algorithmic problems related to the subsequences occurring in the strings of a given language, under the assumption that this language is succinctly represented by a grammar generating it, or an automaton accepting
External link:
http://arxiv.org/abs/2410.07992
Data augmentation plays a crucial role in addressing the challenge of limited expert-annotated datasets in deep learning applications for retinal Optical Coherence Tomography (OCT) scans. This work exhaustively investigates the impact of various data
External link:
http://arxiv.org/abs/2409.13351
This paper presents Tidal-MerzA, a novel system designed for collaborative performances between humans and a machine agent in the context of live coding, specifically focusing on the generation of musical patterns. Tidal-MerzA fuses two foundational
External link:
http://arxiv.org/abs/2409.07918
Published in:
2024 IEEE Intelligent Vehicles Symposium (IV), Jeju Island, Republic of Korea, 2024, pp. 252-257
This paper proposes a control technique for autonomous RC car racing. The presented method does not require any map-building phase beforehand since it operates only local path planning on the actual LiDAR point cloud. Racing control algorithms must h
External link:
http://arxiv.org/abs/2408.15152
Authors:
Shatri, Elona; Fazekas, George
Optical Music Recognition (OMR) automates the transcription of musical notation from images into machine-readable formats like MusicXML, MEI, or MIDI, significantly reducing the costs and time of manual transcription. This study explores knowledge di
External link:
http://arxiv.org/abs/2408.15002
Authors:
Ma, Yinghao; Øland, Anders; Ragni, Anton; Del Sette, Bleiz MacSen; Saitis, Charalampos; Donahue, Chris; Lin, Chenghua; Plachouras, Christos; Benetos, Emmanouil; Shatri, Elona; Morreale, Fabio; Zhang, Ge; Fazekas, György; Xia, Gus; Zhang, Huan; Manco, Ilaria; Huang, Jiawen; Guinot, Julien; Lin, Liwei; Marinelli, Luca; Lam, Max W. Y.; Sharma, Megha; Kong, Qiuqiang; Dannenberg, Roger B.; Yuan, Ruibin; Wu, Shangda; Wu, Shih-Lun; Dai, Shuqi; Lei, Shun; Kang, Shiyin; Dixon, Simon; Chen, Wenhu; Huang, Wenhao; Du, Xingjian; Qu, Xingwei; Tan, Xu; Li, Yizhi; Tian, Zeyue; Wu, Zhiyong; Wu, Zhizheng; Ma, Ziyang; Wang, Ziyu
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models
External link:
http://arxiv.org/abs/2408.14340
Efficient audio representations in a compressed continuous latent space are critical for generative audio modeling and Music Information Retrieval (MIR) tasks. However, some existing audio autoencoders have limitations, such as multi-stage training p
External link:
http://arxiv.org/abs/2408.06500
Authors:
Weck, Benno; Manco, Ilaria; Benetos, Emmanouil; Quinton, Elio; Fazekas, George; Bogdanov, Dmitry
Multimodal models that jointly process audio and language hold great promise in audio understanding and are increasingly being adopted in the music domain. By allowing users to query via text and obtain information about a given audio input, these mo
External link:
http://arxiv.org/abs/2408.01337
Despite the success of contrastive learning in Music Information Retrieval, the inherent ambiguity of contrastive self-supervision presents a challenge. Relying solely on augmentation chains and self-supervised positive sampling strategies can lead t
External link:
http://arxiv.org/abs/2407.13840