Showing 1 - 10 of 59 results for query: "Zheng, Chuanxia"
In this paper, we introduce Splatt3R, a pose-free, feed-forward method for in-the-wild 3D reconstruction and novel view synthesis from stereo pairs. Given uncalibrated natural images, Splatt3R can predict 3D Gaussian Splats without requiring any came…
External link:
http://arxiv.org/abs/2408.13912
We present Puppet-Master, an interactive video generative model that can serve as a motion prior for part-level dynamics. At test time, given a single image and a sparse set of motion trajectories (i.e., drags), Puppet-Master can synthesize a video d…
External link:
http://arxiv.org/abs/2408.04631
Author:
Szymanowicz, Stanislaw, Insafutdinov, Eldar, Zheng, Chuanxia, Campbell, Dylan, Henriques, João F., Rupprecht, Christian, Vedaldi, Andrea
In this paper, we propose Flash3D, a method for scene reconstruction and novel view synthesis from a single image which is both very generalisable and efficient. For generalisability, we start from a "foundation" model for monocular depth estimation…
External link:
http://arxiv.org/abs/2406.04343
We introduce DragAPart, a method that, given an image and a set of drags as input, generates a new image of the same object that responds to the action of the drags. Differently from prior works that focused on repositioning objects, DragAPart predic…
External link:
http://arxiv.org/abs/2403.15382
Author:
Chen, Yuedong, Xu, Haofei, Zheng, Chuanxia, Zhuang, Bohan, Pollefeys, Marc, Geiger, Andreas, Cham, Tat-Jen, Cai, Jianfei
We introduce MVSplat, an efficient model that, given sparse multi-view images as input, predicts clean feed-forward 3D Gaussians. To accurately localize the Gaussian centers, we build a cost volume representation via plane sweeping, where the cross-v…
External link:
http://arxiv.org/abs/2403.14627
3D decomposition/segmentation remains a challenge as large-scale 3D annotated data is not readily available. Contemporary approaches typically leverage 2D machine-generated segments, integrating them for 3D consistency. While the majority of th…
External link:
http://arxiv.org/abs/2403.14619
This paper studies amodal image segmentation: predicting entire object segmentation masks including both visible and invisible (occluded) parts. In previous work, the amodal segmentation ground truth on real images is usually predicted by manual anno…
External link:
http://arxiv.org/abs/2312.17247
Author:
Zheng, Chuanxia, Vedaldi, Andrea
We introduce Free3D, a simple, accurate method for monocular open-set novel view synthesis (NVS). Similar to Zero-1-to-3, we start from a pre-trained 2D image generator for generalization, and fine-tune it for NVS. Compared to other works that took a…
External link:
http://arxiv.org/abs/2312.04551
Our objective in this paper is to probe large vision models to determine to what extent they 'understand' different physical properties of the 3D scene depicted in an image. To this end, we make the following contributions: (i) We introduce a general…
External link:
http://arxiv.org/abs/2310.06836
Author:
Zheng, Chuanxia, Vedaldi, Andrea
Vector Quantisation (VQ) is experiencing a comeback in machine learning, where it is increasingly used in representation learning. However, optimizing the codevectors in existing VQ-VAEs is not entirely trivial. A problem is codebook collapse, where o…
External link:
http://arxiv.org/abs/2307.15139
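The codebook-collapse problem mentioned in the abstract above can be illustrated with a minimal nearest-neighbour quantisation step. This is a generic VQ sketch on toy data, not the paper's method; all names and values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: K=8 codevectors and 64 encoder outputs in 2-D.
# The encoder outputs are deliberately offset from the codebook,
# which concentrates assignments on a few nearby codevectors.
codebook = rng.normal(size=(8, 2))        # K x D codevectors
z = rng.normal(loc=3.0, size=(64, 2))     # encoder outputs, shifted away

# VQ assignment: each z maps to its nearest codevector (Euclidean).
dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
codes = dists.argmin(axis=1)

# Codebook collapse: usage concentrates on a few codevectors; the
# unused ("dead") ones receive no gradient and stop being updated.
usage = np.bincount(codes, minlength=len(codebook))
dead = int((usage == 0).sum())
print(f"dead codevectors: {dead} of {len(codebook)}")
```

Counting per-codevector usage like this is a common diagnostic; mitigation strategies in the VQ-VAE literature include codebook resets and exponential-moving-average codevector updates.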