Showing 1 - 10 of 485 for search: '"Bello, Juan"'
Self-supervised learning (SSL) offers a powerful way to learn robust, generalizable representations without labeled data. In music, where labeled data is scarce, existing SSL methods typically use generated supervision and multi-view redundancy to cr
External link:
http://arxiv.org/abs/2411.02711
Author:
Castelo, Sonia, Rulff, Joao, Solunke, Parikshit, McGowan, Erin, Wu, Guande, Roman, Iran, Lopez, Roque, Steers, Bea, Sun, Qi, Bello, Juan, Feest, Bradley, Middleton, Michael, Mckendrick, Ryan, Silva, Claudio
The concept of an intelligent augmented reality (AR) assistant has significant, wide-ranging applications, with potential uses in medicine, military, and mechanics domains. Such an assistant must be able to perceive the environment and actions, reaso
External link:
http://arxiv.org/abs/2407.12260
Author:
Roman, Iran R., Ick, Christopher, Ding, Sivan, Roman, Adrian S., McFee, Brian, Bello, Juan P.
Sound event localization and detection (SELD) is an important task in machine listening. Major advancements rely on simulated data with sound events in specific rooms and strong spatio-temporal labels. SELD data is simulated by convolving spatially-lo
External link:
http://arxiv.org/abs/2401.12238
Self-supervised monocular depth estimation (DE) is an approach to learning depth without costly depth ground truths. However, it often struggles with moving objects that violate the static scene assumption during training. To address this issue, we i
External link:
http://arxiv.org/abs/2312.10118
Recent advances in neural rendering have shown that, albeit slow, implicit compact models can learn a scene's geometries and view-dependent appearances from multiple views. To maintain such a small memory footprint but achieve faster inference times,
External link:
http://arxiv.org/abs/2312.08136
In this paper, we consider, for the first time, view-dependent effects in single image-based novel view synthesis (NVS) problems. For this, we propose to exploit the camera motion priors in NVS to model view-dependent appearance or effects (VDE) as the negative
External link:
http://arxiv.org/abs/2312.08071
Author:
Wilkins, Julia, Fuentes, Magdalena, Bondi, Luca, Ghaffarzadegan, Shabnam, Abavisani, Ali, Bello, Juan Pablo
Sound event localization and detection (SELD) systems estimate both the direction-of-arrival (DOA) and class of sound sources over time. In the DCASE 2022 SELD Challenge (Task 3), models are designed to operate in a 4-channel setting. While beneficia
External link:
http://arxiv.org/abs/2309.13343
Localizing a moving sound source in the real world involves determining its direction-of-arrival (DOA) and distance relative to a microphone. Advancements in DOA estimation have been facilitated by data-driven methods optimized with large open-source
External link:
http://arxiv.org/abs/2309.09288
Finding the right sound effects (SFX) to match moments in a video is a difficult and time-consuming task, and relies heavily on the quality and completeness of text metadata. Retrieving high-quality (HQ) SFX using a video frame directly as the query
External link:
http://arxiv.org/abs/2308.09089
Author:
Castelo, Sonia, Rulff, Joao, McGowan, Erin, Steers, Bea, Wu, Guande, Chen, Shaoyu, Roman, Iran, Lopez, Roque, Brewer, Ethan, Zhao, Chen, Qian, Jing, Cho, Kyunghyun, He, He, Sun, Qi, Vo, Huy, Bello, Juan, Krone, Michael, Silva, Claudio
The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneous
External link:
http://arxiv.org/abs/2308.06246