Search results

Report

3D Congealing: 3D-Aware Image Alignment in the Wild

Authors: Zhang, Yunzhi; Li, Zizhang; Raj, Amit; Engelhardt, Andreas; Li, Yuanzhen; Hou, Tingbo; Wu, Jiajun; Jampani, Varun

We propose 3D Congealing, a novel problem of 3D-aware alignment for 2D images capturing semantically similar objects. Given a collection of unlabeled Internet images, our goal is to associate the shared semantic parts from the inputs and aggregate th

External link: http://arxiv.org/abs/2404.02125

Report

Learning the 3D Fauna of the Web

Authors: Li, Zizhang; Litvak, Dor; Li, Ruining; Zhang, Yunzhi; Jakab, Tomas; Rupprecht, Christian; Wu, Shangzhe; Vedaldi, Andrea; Wu, Jiajun

Learning 3D models of all animals on the Earth requires massively scaling up existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species

External link: http://arxiv.org/abs/2401.02400

Report

ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image

Authors: Sargent, Kyle; Li, Zizhang; Shah, Tanmay; Herrmann, Charles; Yu, Hong-Xing; Zhang, Yunzhi; Chan, Eric Ryan; Lagun, Dmitry; Fei-Fei, Li; Sun, Deqing; Wu, Jiajun

We introduce a 3D-aware diffusion model, ZeroNVS, for single-image novel view synthesis for in-the-wild scenes. While existing methods are designed for single objects with masked backgrounds, we propose new techniques to address challenges introduced

External link: http://arxiv.org/abs/2310.17994

Report

Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation

Authors: Lyu, Xiaoyang; Dai, Peng; Li, Zizhang; Yan, Dongyu; Lin, Yi; Peng, Yifan; Qi, Xiaojuan

Implicit neural rendering, which uses signed distance function (SDF) representation with geometric priors (such as depth or surface normal), has led to impressive progress in the surface reconstruction of large-scale scenes. However, applying this me

External link: http://arxiv.org/abs/2303.09152

Report

RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction

Authors: Li, Zizhang; Lyu, Xiaoyang; Ding, Yuanyuan; Wang, Mengmeng; Liao, Yiyi; Liu, Yong

Recently, neural implicit surfaces have become popular for multi-view reconstruction. To facilitate practical applications like scene editing and manipulation, some works extend the framework with semantic masks input for the object-compositional rec

External link: http://arxiv.org/abs/2303.08605

Report

Failure-aware Policy Learning for Self-assessable Robotics Tasks

Authors: Xu, Kechun; Chen, Runjian; Zhao, Shuqi; Li, Zizhang; Yu, Hongxiang; Chen, Ci; Wang, Yue; Xiong, Rong

Self-assessment rules play an essential role in safe and effective real-world robotic applications, which verify the feasibility of the selected action before actual execution. But how to utilize the self-assessment results to re-choose actions remai

External link: http://arxiv.org/abs/2302.13024

Report

A Joint Modeling of Vision-Language-Action for Target-oriented Grasping in Clutter

Authors: Xu, Kechun; Zhao, Shuqi; Zhou, Zhongxiang; Li, Zizhang; Pi, Huaijin; Zhu, Yifeng; Wang, Yue; Xiong, Rong

We focus on the task of language-conditioned grasping in clutter, in which a robot is supposed to grasp the target object based on a language instruction. Previous works separately conduct visual grounding to localize the target object, and generate

External link: http://arxiv.org/abs/2302.12610

Report

E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context

Authors: Li, Zizhang; Wang, Mengmeng; Pi, Huaijin; Xu, Kechun; Mei, Jianbiao; Liu, Yong

Recently, the image-wise implicit neural representation of videos, NeRV, has gained popularity for its promising results and swift speed compared to regular pixel-wise implicit representations. However, the redundant parameters within the network str

External link: http://arxiv.org/abs/2207.08132

Report

Searching Parameterized AP Loss for Object Detection

Authors: Tao, Chenxin; Li, Zizhang; Zhu, Xizhou; Huang, Gao; Liu, Yong; Dai, Jifeng

Loss functions play an important role in training deep-network-based object detectors. The most widely used evaluation metric for object detection is Average Precision (AP), which captures the performance of localization and classification sub-tasks

External link: http://arxiv.org/abs/2112.05138

Report

MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation

Authors: Li, Zizhang; Wang, Mengmeng; Mei, Jianbiao; Liu, Yong

Referring image segmentation is a typical multi-modal task, which aims at generating a binary mask for referent described in given language expressions. Prior arts adopt a bimodal solution, taking images and languages as two modalities within an enco

External link: http://arxiv.org/abs/2111.10747
