Search results - "Sima, Chonghao"

Report

Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving

Authors: Ding, Kairui; Chen, Boyuan; Su, Yuchen; Gao, Huan-ang; Jin, Bu; Sima, Chonghao; Zhang, Wuqiang; Li, Xiaohui; Barsch, Paul; Li, Hongyang; Zhao, Hao

End-to-end architectures in autonomous driving (AD) face a significant challenge in interpretability, impeding human-AI trust. Human-friendly natural language has been explored for tasks such as driving explanation and 3D captioning. However, previou

External link: http://arxiv.org/abs/2409.06702

Report

DriveLM: Driving with Graph Visual Question Answering

Authors: Sima, Chonghao; Renz, Katrin; Chitta, Kashyap; Chen, Li; Zhang, Hanxue; Xie, Chengen; Beißwenger, Jens; Luo, Ping; Geiger, Andreas; Li, Hongyang

We study how vision-language models (VLMs) trained on web-scale data can be integrated into end-to-end driving systems to boost generalization and enable interactivity with human users. While recent approaches adapt VLMs to driving via single-round v

External link: http://arxiv.org/abs/2312.14150

Report

Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection

Authors: Huang, Linyan; Li, Zhiqi; Sima, Chonghao; Wang, Wenhai; Wang, Jingdong; Qiao, Yu; Li, Hongyang

Current research is primarily dedicated to advancing the accuracy of camera-only 3D object detectors (apprentice) through the knowledge transferred from LiDAR- or multi-modal-based counterparts (expert). However, the presence of the domain gap betwee

External link: http://arxiv.org/abs/2310.15670

Report

Scene as Occupancy

Authors: Sima, Chonghao; Tong, Wenwen; Wang, Tai; Chen, Li; Wu, Silei; Deng, Hanming; Gu, Yi; Lu, Lewei; Luo, Ping; Lin, Dahua; Li, Hongyang

Human driver can easily describe the complex traffic scene by visual system. Such an ability of precise perception is essential for driver's planning. To achieve this, a geometry-aware representation that quantizes the physical 3D scene into structur

External link: http://arxiv.org/abs/2306.02851

Report

OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping

Authors: Wang, Huijie; Li, Tianyu; Li, Yang; Chen, Li; Sima, Chonghao; Liu, Zhenbo; Wang, Bangjun; Jia, Peijin; Wang, Yuting; Jiang, Shengyin; Wen, Feng; Xu, Hang; Luo, Ping; Yan, Junchi; Zhang, Wei; Li, Hongyang

Accurately depicting the complex traffic scene is a vital component for autonomous vehicles to execute correct judgments. However, existing benchmarks tend to oversimplify the scene by solely focusing on lane perception tasks. Observing that human dr

External link: http://arxiv.org/abs/2304.10440

Report

Sparse Dense Fusion for 3D Object Detection

Authors: Gao, Yulu; Sima, Chonghao; Shi, Shaoshuai; Di, Shangzhe; Liu, Si; Li, Hongyang

With the prevalence of multimodal learning, camera-LiDAR fusion has gained popularity in 3D object detection. Although multiple fusion approaches have been proposed, they can be classified into either sparse-only or dense-only fashion based on the fe

External link: http://arxiv.org/abs/2304.04179

Report

Planning-oriented Autonomous Driving

Authors: Hu, Yihan; Yang, Jiazhi; Chen, Li; Li, Keyu; Sima, Chonghao; Zhu, Xizhou; Chai, Siqi; Du, Senyao; Lin, Tianwei; Wang, Wenhai; Lu, Lewei; Jia, Xiaosong; Liu, Qiang; Dai, Jifeng; Qiao, Yu; Li, Hongyang

Modern autonomous driving system is characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and achieve advanced-level intelligence, contemporary approaches either

External link: http://arxiv.org/abs/2212.10156

Report

Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe

Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending and drawing extensive attention both from industry and academia. Conventional approaches for most autonomous driving algorithms perform detection, segmentatio

External link: http://arxiv.org/abs/2209.05324

Report

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

Authors: Li, Zhiqi; Wang, Wenhai; Li, Hongyang; Xie, Enze; Sima, Chonghao; Lu, Tong; Yu, Qiao; Dai, Jifeng

3D visual perception tasks, including 3D detection and map segmentation based on multi-camera images, are essential for autonomous driving systems. In this work, we present a new framework termed BEVFormer, which learns unified BEV representations wi

External link: http://arxiv.org/abs/2203.17270

Report

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark

Authors: Chen, Li; Sima, Chonghao; Li, Yang; Zheng, Zehan; Xu, Jiajie; Geng, Xiangwei; Li, Hongyang; He, Conghui; Shi, Jianping; Qiao, Yu; Yan, Junchi

Methods for 3D lane detection have been recently proposed to address the issue of inaccurate lane layouts in many autonomous driving scenarios (uphill/downhill, bump, etc.). Previous work struggled in complex cases due to their simple designs of the

External link: http://arxiv.org/abs/2203.11089
