Search results

Report

SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency

Authors: Xie, Yiming; Yao, Chun-Han; Voleti, Vikram; Jiang, Huaizu; Jampani, Varun

We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-view consistent dynamic 3D content generation. Unlike previous methods that rely on separately trained generative models for video generation and novel view s

External link: http://arxiv.org/abs/2407.17470

Report

SMooDi: Stylized Motion Diffusion Model

Authors: Zhong, Lei; Xie, Yiming; Jampani, Varun; Sun, Deqing; Jiang, Huaizu

We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style motion sequences. Unlike existing methods that either generate motion of various content or transfer style from one seq

External link: http://arxiv.org/abs/2407.12783

Report

SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous Driving

Authors: Xie, Yiming; Wei, Henglu; Liu, Zhenyi; Wang, Xiaoyu; Ji, Xiangyang

To advance research in learning-based defogging algorithms, various synthetic fog datasets have been developed. However, existing datasets created using the Atmospheric Scattering Model (ASM) or real-time rendering engines often struggle to produce p

External link: http://arxiv.org/abs/2403.17094

Report

HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models

Authors: Peng, Xiaogang; Xie, Yiming; Wu, Zizhao; Jampani, Varun; Sun, Deqing; Jiang, Huaizu

We address the problem of generating realistic 3D human-object interactions (HOIs) driven by textual prompts. To this end, we take a modular design and decompose the complex task into simpler sub-tasks. We first develop a dual-branch diffusion model

External link: http://arxiv.org/abs/2312.06553

Report

OmniControl: Control Any Joint at Any Time for Human Motion Generation

Authors: Xie, Yiming; Jampani, Varun; Zhong, Lei; Sun, Deqing; Jiang, Huaizu

We present a novel approach named OmniControl for incorporating flexible spatial control signals into a text-conditioned human motion generation model based on the diffusion process. Unlike previous methods that can only control the pelvis trajectory

External link: http://arxiv.org/abs/2310.08580

Report

Pixel-Aligned Recurrent Queries for Multi-View 3D Object Detection

Authors: Xie, Yiming; Jiang, Huaizu; Gkioxari, Georgia; Straub, Julian

We present PARQ - a multi-view 3D object detector with transformer and pixel-aligned recurrent queries. Unlike previous works that use learnable features or only encode 3D point positions as queries in the decoder, PARQ leverages appearance-enhanced

External link: http://arxiv.org/abs/2310.01401

Report

Diagnosing Human-object Interaction Detectors

Authors: Zhu, Fangrui; Xie, Yiming; Xie, Weidi; Jiang, Huaizu

We have witnessed significant progress in human-object interaction (HOI) detection. The reliance on mAP (mean Average Precision) scores as a summary metric, however, does not provide sufficient insight into the nuances of model performance (e.g., why

External link: http://arxiv.org/abs/2308.08529

Report

A Strong Baseline for Point Cloud Registration via Direct Superpoints Matching

Authors: Gupta, Aniket; Xie, Yiming; Singh, Hanumant; Jiang, Huaizu

Deep neural networks endow the downsampled superpoints with highly discriminative feature representations. Previous dominant point cloud registration approaches match these feature representations as the first step, e.g., using the Sinkhorn algorithm

External link: http://arxiv.org/abs/2307.01362

Report

PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos

Authors: Xie, Yiming; Gadelha, Matheus; Yang, Fengting; Zhou, Xiaowei; Jiang, Huaizu

We present PlanarRecon -- a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video. Unlike previous works that detect planes in 2D from a single image, PlanarRecon incrementally detects planes in

External link: http://arxiv.org/abs/2206.07710
