Search results

Report

SceneCraft: Layout-Guided 3D Scene Generation

Authors: Yang, Xiuyu; Man, Yunze; Chen, Jun-Kun; Wang, Yu-Xiong

The creation of complex 3D scenes tailored to user specifications has been a tedious and challenging task with traditional 3D modeling tools. Although some pioneering methods have achieved automatic text-to-3D generation, they are generally limited t

External link: http://arxiv.org/abs/2410.09049

Report

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

Authors: Man, Yunze; Zheng, Shuhong; Bao, Zhipeng; Hebert, Martial; Gui, Liang-Yan; Wang, Yu-Xiong

Complex 3D scene understanding has gained increasing attention, with scene encoding strategies playing a crucial role in this success. However, the optimal scene encoding strategies for various scenarios remain unclear, particularly compared to their

External link: http://arxiv.org/abs/2409.03757

Report

Floating No More: Object-Ground Reconstruction from a Single Image

Authors: Man, Yunze; Sheng, Yichen; Zhang, Jianming; Gui, Liang-Yan; Wang, Yu-Xiong

Recent advancements in 3D object reconstruction from single images have primarily focused on improving the accuracy of object shapes. Yet, these techniques often fail to accurately capture the inter-relation between the object, ground, and camera. As

External link: http://arxiv.org/abs/2407.18914

Report

Situational Awareness Matters in 3D Vision Language Reasoning

Authors: Man, Yunze; Gui, Liang-Yan; Wang, Yu-Xiong

Being able to carry out complicated vision language reasoning tasks in 3D space represents a significant milestone in developing household robots and human-centered embodied AI. In this work, we demonstrate that a critical and distinct challenge in 3

External link: http://arxiv.org/abs/2406.07544

Report

Frozen Transformers in Language Models Are Effective Visual Encoder Layers

Authors: Pang, Ziqi; Xie, Ziyang; Man, Yunze; Wang, Yu-Xiong

This paper reveals that large language models (LLMs), despite being trained solely on textual data, are surprisingly strong encoders for purely visual tasks in the absence of language. Even more intriguingly, this can be achieved by a simple yet prev

External link: http://arxiv.org/abs/2310.12973

Report

DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception

Authors: Man, Yunze; Gui, Liang-Yan; Wang, Yu-Xiong

Closing the domain gap between training and deployment and incorporating multiple sensor modalities are two challenging yet critical topics for self-driving. Existing work focuses on only one of the above topics, overlooking the simultaneous d

External link: http://arxiv.org/abs/2305.03724

Report

Fast Graph Neural Tangent Kernel via Kronecker Sketching

Authors: Jiang, Shunhua; Man, Yunze; Song, Zhao; Yu, Zheng; Zhuo, Danyang

Many deep learning tasks have to deal with graphs (e.g., protein structures, social networks, source code abstract syntax trees). Due to the importance of these tasks, people turned to Graph Neural Networks (GNNs) as the de facto method for learning

External link: http://arxiv.org/abs/2112.02446

Report

Multi-Echo LiDAR for 3D Object Detection

Authors: Man, Yunze; Weng, Xinshuo; Sivakumar, Prasanna Kumar; O'Toole, Matthew; Kitani, Kris

LiDAR sensors can be used to obtain a wide range of measurement signals other than a simple 3D point cloud, and those signals can be leveraged to improve perception tasks like 3D object detection. A single laser pulse can be partially reflected by mu

External link: http://arxiv.org/abs/2107.11470

Report

Multi-Modality Task Cascade for 3D Object Detection

Authors: Park, Jinhyung; Weng, Xinshuo; Man, Yunze; Kitani, Kris

Point clouds and RGB images are naturally complementary modalities for 3D visual understanding - the former provides sparse but accurate locations of points on objects, while the latter contains dense color and texture information. Despite this poten

External link: http://arxiv.org/abs/2107.04013

Report

Graph Neural Networks for 3D Multi-Object Tracking

Authors: Weng, Xinshuo; Wang, Yongxin; Man, Yunze; Kitani, Kris

3D Multi-object tracking (MOT) is crucial to autonomous systems. Recent work often uses a tracking-by-detection pipeline, where the feature of each object is extracted independently to compute an affinity matrix. Then, the affinity matrix is passed t

External link: http://arxiv.org/abs/2008.09506
