Showing 1 - 10 of 112 for search: '"Liu, Qihao"'
Author:
Ma, Wufei, Zeng, Guanning, Zhang, Guofeng, Liu, Qihao, Zhang, Letian, Kortylewski, Adam, Liu, Yaoyao, Yuille, Alan
A vision model with general-purpose object-level 3D understanding should be capable of inferring both 2D (e.g., class name and bounding box) and 3D information (e.g., 3D location and 3D viewpoint) for arbitrary rigid objects in natural images. This i…
External link:
http://arxiv.org/abs/2406.09613
This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. Diffusion models have gained prominence for their effectiveness in high-fidelity image generation.
External link:
http://arxiv.org/abs/2406.09416
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets (represented by Neural Radiance Fields) from text prompts. Unlike recent 3D generative models that rely on clean and well-aligned 3D data, limiting them t…
External link:
http://arxiv.org/abs/2406.04322
Author:
Wang, Qian, Liu, Yaoyao, Ling, Hefei, Li, Yingwei, Liu, Qihao, Li, Ping, Chen, Jiazhong, Yuille, Alan, Yu, Ning
In response to the rapidly evolving nature of adversarial attacks against visual classifiers on a monthly basis, numerous defenses have been proposed to generalize against as many known attacks as possible. However, designing a defense method that ge…
External link:
http://arxiv.org/abs/2312.09481
We present GLEE in this work, an object-level foundation model for locating and identifying objects in images and videos. Through a unified framework, GLEE accomplishes detection, segmentation, tracking, grounding, and identification of arbitrary obj…
External link:
http://arxiv.org/abs/2312.09158
Author:
Xu, Jiacong, Zhang, Yi, Peng, Jiawei, Ma, Wufei, Jesslen, Artur, Ji, Pengliang, Hu, Qixin, Zhang, Jiehua, Liu, Qihao, Wang, Jiahao, Ji, Wei, Wang, Chen, Yuan, Xiaoding, Kaushik, Prakhar, Zhang, Guofeng, Liu, Jie, Xie, Yushan, Cui, Yawen, Yuille, Alan, Kortylewski, Adam
Accurately estimating the 3D pose and shape is an essential step towards understanding animal behavior, and can potentially benefit many downstream applications, such as wildlife conservation. However, research in this area is held back by the lack o…
External link:
http://arxiv.org/abs/2308.11737
Author:
Ma, Wufei, Liu, Qihao, Wang, Jiahao, Wang, Angtian, Yuan, Xiaoding, Zhang, Yi, Xiao, Zihao, Zhang, Guofeng, Lu, Beijia, Duan, Ruxiao, Qi, Yongrui, Kortylewski, Adam, Liu, Yaoyao, Yuille, Alan
Diffusion models have emerged as a powerful generative method, capable of producing stunning photo-realistic images from natural language descriptions. However, these models lack explicit control over the 3D structure in the generated images. Consequ…
External link:
http://arxiv.org/abs/2306.08103
Text-guided diffusion models (TDMs) are widely applied but can fail unexpectedly. Common failures include: (i) natural-looking text prompts generating images with the wrong content, or (ii) different random samples of the latent variables that genera…
External link:
http://arxiv.org/abs/2306.00974
Despite significant efforts, cutting-edge video segmentation methods still remain sensitive to occlusion and rapid movement, due to their reliance on the appearance of objects in the form of object embeddings, which are vulnerable to these disturbanc…
External link:
http://arxiv.org/abs/2303.08132
PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation
Human pose and shape (HPS) estimation methods achieve remarkable results. However, current HPS benchmarks are mostly designed to test models in scenarios that are similar to the training data. This can lead to critical situations in real-world applic…
External link:
http://arxiv.org/abs/2303.07337