Showing 1 - 10 of 33
for search: '"Bhalgat, Yash"'
Author:
Dhiman, Ankit, Shah, Manan, Parihar, Rishubh, Bhalgat, Yash, Boregowda, Lokesh R, Babu, R Venkatesh
We tackle the problem of generating highly realistic and plausible mirror reflections using diffusion-based generative models. We formulate this problem as an image inpainting task, allowing for more user control over the placement of mirrors during…
External link:
http://arxiv.org/abs/2409.14677
Author:
Liu, Changkun, Chen, Shuai, Bhalgat, Yash, Hu, Siyan, Cheng, Ming, Wang, Zirui, Prisacariu, Victor Adrian, Braud, Tristan
We leverage 3D Gaussian Splatting (3DGS) as a scene representation and propose a novel test-time camera pose refinement framework, GSLoc. This framework enhances the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression…
External link:
http://arxiv.org/abs/2408.11085
Author:
Bhalgat, Yash, Tschernezki, Vadim, Laina, Iro, Henriques, João F., Vedaldi, Andrea, Zisserman, Andrew
Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person…
External link:
http://arxiv.org/abs/2408.09860
Author:
Shah, Manan, Bhalgat, Yash
This report is a reproducibility study of the paper "CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification" (Abdelfattah et al., ICCV 2023). Our report makes the following contributions: (1) We provide a reproducible, well-commented…
External link:
http://arxiv.org/abs/2405.11574
Author:
Ma, Xianzheng, Bhalgat, Yash, Smart, Brandon, Chen, Shuai, Li, Xinghui, Ding, Jian, Gu, Jindong, Chen, Dave Zhenyu, Peng, Songyou, Bian, Jia-Wang, Torr, Philip H, Pollefeys, Marc, Nießner, Matthias, Reid, Ian D, Chang, Angel X., Laina, Iro, Prisacariu, Victor Adrian
As large language models (LLMs) evolve, their integration with 3D spatial data (3D-LLMs) has seen rapid progress, offering unprecedented capabilities for understanding and interacting with physical spaces. This survey provides a comprehensive overview…
External link:
http://arxiv.org/abs/2405.10255
Understanding complex scenes at multiple levels of abstraction remains a formidable challenge in computer vision. To address this, we introduce Nested Neural Feature Fields (N2F2), a novel approach that employs hierarchical supervision to learn a single…
External link:
http://arxiv.org/abs/2403.10997
Author:
Tao, Yifu, Bhalgat, Yash, Fu, Lanke Frank Tarimo, Mattamala, Matias, Chebrolu, Nived, Fallon, Maurice
We present a neural-field-based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photo-realistic textures. This system adapts the state-of-the-art…
External link:
http://arxiv.org/abs/2403.06877
Instance segmentation in 3D is a challenging task due to the lack of large-scale annotated datasets. In this paper, we show that this task can be addressed effectively by leveraging instead 2D pre-trained models for instance segmentation. We propose…
External link:
http://arxiv.org/abs/2306.04633
Author:
Chen, Shuai, Bhalgat, Yash, Li, Xinghui, Bian, Jiawang, Li, Kejie, Wang, Zirui, Prisacariu, Victor Adrian
Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images. However, the predominant APR architectures only rely on 2D operations during inference, resulting in limited accuracy of pose estimation…
External link:
http://arxiv.org/abs/2303.10087
Transformers are powerful visual learners, in large part due to their conspicuous lack of manually-specified priors. This flexibility can be problematic in tasks that involve multiple-view geometry, due to the near-infinite possible variations in 3D…
External link:
http://arxiv.org/abs/2211.15107