Search results

Report

Self-Supervised Multi-View Learning for Disentangled Music Audio Representations

Authors: Wilkins, Julia; Ding, Sivan; Fuentes, Magdalena; Bello, Juan Pablo

Self-supervised learning (SSL) offers a powerful way to learn robust, generalizable representations without labeled data. In music, where labeled data is scarce, existing SSL methods typically use generated supervision and multi-view redundancy to cr

External link: http://arxiv.org/abs/2411.02711

Report

HuBar: A Visual Analytics Tool to Explore Human Behaviour based on fNIRS in AR guidance systems

Authors: Castelo, Sonia; Rulff, Joao; Solunke, Parikshit; McGowan, Erin; Wu, Guande; Roman, Iran; Lopez, Roque; Steers, Bea; Sun, Qi; Bello, Juan; Feest, Bradley; Middleton, Michael; Mckendrick, Ryan; Silva, Claudio

The concept of an intelligent augmented reality (AR) assistant has significant, wide-ranging applications, with potential uses in medicine, military, and mechanics domains. Such an assistant must be able to perceive the environment and actions, reaso

External link: http://arxiv.org/abs/2407.12260

Report

Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms

Authors: Roman, Iran R.; Ick, Christopher; Ding, Sivan; Roman, Adrian S.; McFee, Brian; Bello, Juan P.

Sound event localization and detection (SELD) is an important task in machine listening. Major advancements rely on simulated data with sound events in specific rooms and strong spatio-temporal labels. SELD data is simulated by convolving spatialy-lo

External link: http://arxiv.org/abs/2401.12238

Report

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Authors: Moon, Jaeho; Bello, Juan Luis Gonzalez; Kwon, Byeongjun; Kim, Munchurl

Self-supervised monocular depth estimation (DE) is an approach to learning depth without costly depth ground truths. However, it often struggles with moving objects that violate the static scene assumption during training. To address this issue, we i

External link: http://arxiv.org/abs/2312.10118

Report

ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields

Authors: Bello, Juan Luis Gonzalez; Bui, Minh-Quan Viet; Kim, Munchurl

Recent advances in neural rendering have shown that, albeit slow, implicit compact models can learn a scene's geometries and view-dependent appearances from multiple views. To maintain such a small memory footprint but achieve faster inference times,

External link: http://arxiv.org/abs/2312.08136

Report

Novel View Synthesis with View-Dependent Effects from a Single Image

Authors: Bello, Juan Luis Gonzalez; Kim, Munchurl

In this paper, we firstly consider view-dependent effects into single image-based novel view synthesis (NVS) problems. For this, we propose to exploit the camera motion priors in NVS to model view-dependent appearance or effects (VDE) as the negative

External link: http://arxiv.org/abs/2312.08071

Report

Two vs. Four-Channel Sound Event Localization and Detection

Authors: Wilkins, Julia; Fuentes, Magdalena; Bondi, Luca; Ghaffarzadegan, Shabnam; Abavisani, Ali; Bello, Juan Pablo

Sound event localization and detection (SELD) systems estimate both the direction-of-arrival (DOA) and class of sound sources over time. In the DCASE 2022 SELD Challenge (Task 3), models are designed to operate in a 4-channel setting. While beneficia

External link: http://arxiv.org/abs/2309.13343

Report

Sound Source Distance Estimation in Diverse and Dynamic Acoustic Conditions

Authors: Kushwaha, Saksham Singh; Roman, Iran R.; Fuentes, Magdalena; Bello, Juan Pablo

Localizing a moving sound source in the real world involves determining its direction-of-arrival (DOA) and distance relative to a microphone. Advancements in DOA estimation have been facilitated by data-driven methods optimized with large open-source

External link: http://arxiv.org/abs/2309.09288

Report

Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries

Authors: Wilkins, Julia; Salamon, Justin; Fuentes, Magdalena; Bello, Juan Pablo; Nieto, Oriol

Finding the right sound effects (SFX) to match moments in a video is a difficult and time-consuming task, and relies heavily on the quality and completeness of text metadata. Retrieving high-quality (HQ) SFX using a video frame directly as the query

External link: http://arxiv.org/abs/2308.09089

Report

ARGUS: Visualization of AI-Assisted Task Guidance in AR

Authors: Castelo, Sonia; Rulff, Joao; McGowan, Erin; Steers, Bea; Wu, Guande; Chen, Shaoyu; Roman, Iran; Lopez, Roque; Brewer, Ethan; Zhao, Chen; Qian, Jing; Cho, Kyunghyun; He, He; Sun, Qi; Vo, Huy; Bello, Juan; Krone, Michael; Silva, Claudio

The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneous

External link: http://arxiv.org/abs/2308.06246
