Showing 1 - 10 of 60 for search: '"Laina, Iro"'
Author:
Ohtani, Go, Tadokoro, Ryu, Yamada, Ryosuke, Asano, Yuki M., Laina, Iro, Rupprecht, Christian, Inoue, Nakamasa, Yokota, Rio, Kataoka, Hirokatsu, Aoki, Yoshimitsu
In this work, we investigate the understudied effect of the training data used for image super-resolution (SR). Most commonly, novel SR methods are developed and benchmarked on common training datasets such as DIV2K and DF2K. However, we investigate…
External link:
http://arxiv.org/abs/2409.00768
In this paper, we introduce Splatt3R, a pose-free, feed-forward method for in-the-wild 3D reconstruction and novel view synthesis from stereo pairs. Given uncalibrated natural images, Splatt3R can predict 3D Gaussian Splats without requiring any came…
External link:
http://arxiv.org/abs/2408.13912
Author:
Bhalgat, Yash, Tschernezki, Vadim, Laina, Iro, Henriques, João F., Vedaldi, Andrea, Zisserman, Andrew
Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person…
External link:
http://arxiv.org/abs/2408.09860
Author:
Nakamura, Ryo, Tadokoro, Ryu, Yamada, Ryosuke, Asano, Yuki M., Laina, Iro, Rupprecht, Christian, Inoue, Nakamasa, Yokota, Rio, Kataoka, Hirokatsu
Pre-training and transfer learning are important building blocks of current computer vision systems. While pre-training is usually performed on large real-world image datasets, in this paper we ask whether this is truly necessary. To this end, we s…
External link:
http://arxiv.org/abs/2408.00677
Author:
Ma, Xianzheng, Bhalgat, Yash, Smart, Brandon, Chen, Shuai, Li, Xinghui, Ding, Jian, Gu, Jindong, Chen, Dave Zhenyu, Peng, Songyou, Bian, Jia-Wang, Torr, Philip H, Pollefeys, Marc, Nießner, Matthias, Reid, Ian D, Chang, Angel X., Laina, Iro, Prisacariu, Victor Adrian
As large language models (LLMs) evolve, their integration with 3D spatial data (3D-LLMs) has seen rapid progress, offering unprecedented capabilities for understanding and interacting with physical spaces. This survey provides a comprehensive overvie…
External link:
http://arxiv.org/abs/2405.10255
3D scene generation has quickly become a challenging new research direction, fueled by consistent improvements of 2D generative diffusion models. Most prior work in this area generates scenes by iteratively stitching newly generated frames with exist…
External link:
http://arxiv.org/abs/2404.19758
We consider the problem of editing 3D objects and scenes based on open-ended language instructions. A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process, obviating the need for 3D data. However, t…
External link:
http://arxiv.org/abs/2404.18929
Understanding complex scenes at multiple levels of abstraction remains a formidable challenge in computer vision. To address this, we introduce Nested Neural Feature Fields (N2F2), a novel approach that employs hierarchical supervision to learn a sin…
External link:
http://arxiv.org/abs/2403.10997
Author:
Melas-Kyriazi, Luke, Laina, Iro, Rupprecht, Christian, Neverova, Natalia, Vedaldi, Andrea, Gafni, Oran, Kokkinos, Filippos
Most text-to-3D generators build upon off-the-shelf text-to-image models trained on billions of images. They use variants of Score Distillation Sampling (SDS), which is slow, somewhat unstable, and prone to artifacts. A mitigation is to fine-tune the…
External link:
http://arxiv.org/abs/2402.08682
We propose a novel feed-forward 3D editing framework called Shap-Editor. Prior research on editing 3D objects primarily concentrated on editing individual objects by leveraging off-the-shelf 2D image editing networks. This is achieved via a process c…
External link:
http://arxiv.org/abs/2312.09246