Showing 1 - 10 of 84 results for query: '"GAO Ruohan"'
Author:
Falcon-Perez, Ricardo, Gao, Ruohan, Mueckl, Gregor, Gari, Sebastia V. Amengual, Ananthabhotla, Ishwarya
The task of Novel View Acoustic Synthesis (NVAS) - generating Room Impulse Responses (RIRs) for unseen source and receiver positions in a scene - has recently gained traction, especially given its relevance to Augmented Reality (AR) and Virtual Reality (VR) …
External link:
http://arxiv.org/abs/2410.23523
Accurately estimating and simulating the physical properties of objects from real-world sound recordings is of great practical importance in the fields of vision, graphics, and robotics. However, the progress in these directions has been limited …
External link:
http://arxiv.org/abs/2409.13486
Author:
Yun, Heeseung, Gao, Ruohan, Ananthabhotla, Ishwarya, Kumar, Anurag, Donley, Jacob, Li, Chao, Kim, Gunhee, Ithapu, Vamsi Krishna, Murdock, Calvin
Egocentric videos provide comprehensive contexts for user and scene understanding, spanning multisensory perception to behavioral interaction. We propose Spherical World-Locking (SWL) as a general framework for egocentric scene representation, which …
External link:
http://arxiv.org/abs/2408.05364
Author:
Chowdhury, Sanjoy, Nag, Sayan, Dasgupta, Subhrajyoti, Chen, Jun, Elhoseiny, Mohamed, Gao, Ruohan, Manocha, Dinesh
Leveraging Large Language Models' remarkable proficiency in text-based tasks, recent works on Multi-modal LLMs (MLLMs) extend them to other modalities like vision and audio. However, the progress in these directions has been mostly focused on tasks …
External link:
http://arxiv.org/abs/2407.01851
Recent years have seen immense progress in 3D computer vision and computer graphics, with emerging tools that can virtualize real-world 3D environments for numerous Mixed Reality (XR) applications. However, alongside immersive visual experiences, …
External link:
http://arxiv.org/abs/2406.07532
Author:
Jia, Wenqi, Liu, Miao, Jiang, Hao, Ananthabhotla, Ishwarya, Rehg, James M., Ithapu, Vamsi Krishna, Gao, Ruohan
In recent years, the thriving development of research related to egocentric videos has provided a unique perspective for the study of conversational interactions, where both visual and audio signals play a crucial role. While most prior work focuses on …
External link:
http://arxiv.org/abs/2312.12870
A room's acoustic properties are a product of the room's geometry, the objects within the room, and their specific positions. A room's acoustic properties can be characterized by its impulse response (RIR) between a source and listener location, or …
External link:
http://arxiv.org/abs/2311.03517
Author:
Zhang, Ruohan, Lee, Sharon, Hwang, Minjune, Hiranaka, Ayano, Wang, Chen, Ai, Wensi, Tan, Jin Jie Ryan, Gupta, Shreya, Hao, Yilun, Levine, Gabrael, Gao, Ruohan, Norcia, Anthony, Fei-Fei, Li, Wu, Jiajun
We present Neural Signal Operated Intelligent Robots (NOIR), a general-purpose, intelligent brain-robot interface system that enables humans to command robots to perform everyday activities through brain signals. Through this interface, humans communicate …
External link:
http://arxiv.org/abs/2311.01454
Author:
Clarke, Samuel, Gao, Ruohan, Wang, Mason, Rau, Mark, Xu, Julia, Wang, Jui-Hsien, James, Doug L., Wu, Jiajun
Objects make unique sounds under different perturbations, environment conditions, and poses relative to the listener. While prior works have modeled impact sounds and sound propagation in simulation, we lack a standard dataset of impact sound fields …
External link:
http://arxiv.org/abs/2306.09944
Author:
Gao, Ruohan, Dou, Yiming, Li, Hao, Agarwal, Tanmay, Bohg, Jeannette, Li, Yunzhu, Fei-Fei, Li, Wu, Jiajun
We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning, centered around object recognition, reconstruction, and manipulation with sight, sound, and touch. We also introduce the ObjectFolder Real …
External link:
http://arxiv.org/abs/2306.00956