Showing 1 - 10 of 394 for search: '"Chen Yilun"'
While the capabilities of autonomous driving have advanced rapidly, merging into dense traffic remains a significant challenge. Many motion planning methods for this scenario have been proposed, but it is hard to evaluate them. Most existing closed-lo
External link:
http://arxiv.org/abs/2410.15912
3D visual grounding is crucial for robots, requiring integration of natural language and 3D scene understanding. Traditional methods depending on supervised learning with 3D point clouds are limited by scarce datasets. Recently zero-shot methods leve
External link:
http://arxiv.org/abs/2410.13860
Large language models (LLMs) have shown promise in many natural language understanding tasks, including content moderation. However, these models can be expensive to query in real-time and do not allow for a community-specific approach to content mod
External link:
http://arxiv.org/abs/2410.13155
Author:
Zhang, Wei; Li, Pengfei; Wang, Junli; Sun, Bingchuan; Jin, Qihao; Bao, Guangjun; Rui, Shibo; Yu, Yang; Ding, Wenchao; Li, Peng; Chen, Yilun
Automatic Emergency Braking (AEB) systems are a crucial component in ensuring the safety of passengers in autonomous vehicles. Conventional AEB systems primarily rely on closed-set perception modules to recognize traffic conditions and assess collisi
External link:
http://arxiv.org/abs/2410.08616
Author:
Wang, Hanqing; Chen, Jiahe; Huang, Wensi; Ben, Qingwei; Wang, Tai; Mi, Boyu; Huang, Tao; Zhao, Siheng; Chen, Yilun; Yang, Sizhe; Cao, Peizhou; Yu, Wenye; Ye, Zichao; Li, Jialun; Long, Junfeng; Wang, Zirui; Wang, Huiling; Zhao, Ying; Tu, Zhongying; Qiao, Yu; Lin, Dahua; Pang, Jiangmiao
Recent works have been exploring the scaling laws in the field of Embodied AI. Given the prohibitive costs of collecting real-world data, we believe the Simulation-to-Real (Sim2Real) paradigm is a crucial step for scaling the learning of embodied mod
External link:
http://arxiv.org/abs/2407.10943
Object-oriented embodied navigation aims to locate specific objects, defined by category or depicted in images. Existing methods often struggle to generalize to open vocabulary goals without extensive training data. While recent advances in Vision-La
External link:
http://arxiv.org/abs/2407.09016
Author:
Lyu, Ruiyuan; Wang, Tai; Lin, Jingli; Yang, Shuai; Mao, Xiaohan; Chen, Yilun; Xu, Runsen; Huang, Haifeng; Zhu, Chenming; Lin, Dahua; Pang, Jiangmiao
With the emergence of LLMs and their integration with other data modalities, multi-modal 3D perception attracts more attention due to its connectivity to the physical world and makes rapid progress. However, limited by existing datasets, previous wor
External link:
http://arxiv.org/abs/2406.09401
Severe data imbalance naturally exists among web-scale vision-language datasets. Despite this, we find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning, and demonstrates significant effectiv
External link:
http://arxiv.org/abs/2405.21070
Author:
Chen, Yilun; Yang, Shuai; Huang, Haifeng; Wang, Tai; Xu, Runsen; Lyu, Ruiyuan; Lin, Dahua; Pang, Jiangmiao
Prior studies on 3D scene understanding have primarily developed specialized models for specific tasks or required task-specific fine-tuning. In this study, we propose Grounded 3D-LLM, which explores the potential of 3D large multi-modal models (3D L
External link:
http://arxiv.org/abs/2405.10370
Author:
Lyu, Xiaoyang; Sun, Yang-Tian; Huang, Yi-Hua; Wu, Xiuzhe; Yang, Ziyi; Chen, Yilun; Pang, Jiangmiao; Qi, Xiaojuan
In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS.
External link:
http://arxiv.org/abs/2404.00409