Showing 1 - 10 of 58 for search: '"Xiong, Yuwen"'
Author:
Hou, Zhi, Zhang, Tianyi, Xiong, Yuwen, Pu, Hengjun, Zhao, Chengyang, Tong, Ronglei, Qiao, Yu, Dai, Jifeng, Chen, Yuntao
Recent large visual-language action models pretrained on diverse robot datasets have demonstrated the potential for generalizing to new environments with a few in-domain data. However, those approaches usually predict discretized or continuous action
External link:
http://arxiv.org/abs/2410.15959
In this paper, we introduce the big.LITTLE Vision Transformer, an innovative architecture aimed at achieving efficient visual recognition. This dual-transformer system is composed of two distinct blocks: the big performance block, characterized by it
External link:
http://arxiv.org/abs/2410.10267
Self-supervised learning has driven significant progress in learning from single-subject, iconic images. However, there are still unanswered questions about the use of minimally-curated, naturalistic video data, which contain dense scenes with many i
External link:
http://arxiv.org/abs/2408.11208
Author:
Tian, Changyao, Zhu, Xizhou, Xiong, Yuwen, Wang, Weiyun, Chen, Zhe, Wang, Wenhai, Chen, Yuntao, Lu, Lewei, Lu, Tong, Zhou, Jie, Li, Hongsheng, Qiao, Yu, Dai, Jifeng
Developing generative models for interleaved image-text data has both research and practical value. It requires models to understand the interleaved sequences and subsequently generate images and text. However, existing attempts are limited by the is
External link:
http://arxiv.org/abs/2401.10208
Author:
Xiong, Yuwen, Li, Zhiqi, Chen, Yuntao, Wang, Feng, Zhu, Xizhou, Luo, Jiapeng, Wang, Wenhai, Lu, Tong, Li, Hongsheng, Qiao, Yu, Lu, Lewei, Zhou, Jie, Dai, Jifeng
We introduce Deformable Convolution v4 (DCNv4), a highly efficient and effective operator designed for a broad spectrum of vision applications. DCNv4 addresses the limitations of its predecessor, DCNv3, with two key enhancements: 1. removing softmax
External link:
http://arxiv.org/abs/2401.06197
Author:
Zhang, Lunjun, Yang, Anqi Joyce, Xiong, Yuwen, Casas, Sergio, Yang, Bin, Ren, Mengye, Urtasun, Raquel
In this paper, we study the problem of unsupervised object detection from 3D point clouds in self-driving scenes. We present a simple yet effective method that exploits (i) point clustering in near-range areas where the point clouds are dense, (ii) t
External link:
http://arxiv.org/abs/2311.02007
LiDAR provides accurate geometric measurements of the 3D world. Unfortunately, dense LiDARs are very expensive and the point clouds captured by low-beam LiDAR are often sparse. To address these issues, we present UltraLiDAR, a data-driven framework f
External link:
http://arxiv.org/abs/2311.01448
Author:
Yang, Anqi Joyce, Casas, Sergio, Dvornik, Nikita, Segal, Sean, Xiong, Yuwen, Hu, Jordan Sir Kwang, Fang, Carter, Urtasun, Raquel
Published in:
CoRL 2023
A major bottleneck to scaling up training of self-driving perception systems is the human annotations required for supervision. A promising alternative is to leverage "auto-labelling" offboard perception models that are trained to automatically gene
External link:
http://arxiv.org/abs/2311.01444
Self-driving vehicles (SDVs) must be rigorously tested on a wide range of scenarios to ensure safe deployment. The industry typically relies on closed-loop simulation to evaluate how the SDV interacts on a corpus of synthetic and real scenarios and v
External link:
http://arxiv.org/abs/2311.01446
Learning world models can teach an agent how the world works in an unsupervised manner. Even though it can be viewed as a special case of sequence modeling, progress for scaling world models on robotic applications such as autonomous driving has been
External link:
http://arxiv.org/abs/2311.01017