Showing 1 - 10
of 2,334
for search: '"Zhang, Siyu"'
Authors:
Zhuang, Jiaheng, Wang, Guoan, Zhang, Siyu, Wang, Xiyang, Zhou, Hangning, Xu, Ziyao, Zhang, Chi, Li, Zhiheng
3D multi-object tracking and trajectory prediction are two crucial modules in autonomous driving systems. Generally, the two tasks are handled separately in traditional paradigms and a few methods have started to explore modeling these two tasks in a
External link:
http://arxiv.org/abs/2406.19844
Neural implicit representations have recently demonstrated considerable potential in the field of visual simultaneous localization and mapping (SLAM). This is due to their inherent advantages, including low storage overhead and representation continu
External link:
http://arxiv.org/abs/2405.15151
Authors:
Zhang, Diankun, Wang, Guoan, Zhu, Runwen, Zhao, Jianbo, Chen, Xiwu, Zhang, Siyu, Gong, Jiahao, Zhou, Qibin, Zhang, Wenyuan, Wang, Ningzi, Tan, Feiyang, Zhou, Hangning, Xu, Ziyao, Yao, Haotian, Zhang, Chi, Liu, Xiaojun, Di, Xiaoguang, Li, Bin
End-to-End paradigms use a unified framework to implement multi-tasks in an autonomous driving system. Despite simplicity and clarity, the performance of end-to-end autonomous driving methods on sub-tasks is still far behind the single-task methods.
External link:
http://arxiv.org/abs/2404.06892
In this paper, we present two fast and interpretable decomposition methods for 2D homography, which are named Similarity-Kernel-Similarity (SKS) and Affine-Core-Affine (ACA) transformations respectively. Under the minimal $4$-point configuration, the
External link:
http://arxiv.org/abs/2402.18008
Video compression is widely used in digital television, surveillance systems, and virtual reality. Real-time video decoding is crucial in practical scenarios. Recently, neural video compression (NVC) combines traditional coding with deep learning, ac
External link:
http://arxiv.org/abs/2312.10716
Within the multimodal field, large vision-language models (LVLMs) have made significant progress due to their strong perception and reasoning capabilities in the visual and language systems. However, LVLMs are still plagued by the two critical issues
External link:
http://arxiv.org/abs/2311.12639
Within the multimodal field, the key to integrating vision and language lies in establishing a good alignment strategy. Recently, benefiting from the success of self-supervised learning, significant progress has been made in multimodal semantic repre
External link:
http://arxiv.org/abs/2310.13447
Goal-conditioned hierarchical reinforcement learning (HRL) presents a promising approach for enabling effective exploration in complex, long-horizon reinforcement learning (RL) tasks through temporal abstraction. Empirically, heightened inter-level c
External link:
http://arxiv.org/abs/2309.13508
With the success of self-supervised learning, multimodal foundation models have rapidly adapted a wide range of downstream tasks driven by vision and language (VL) pretraining. State-of-the-art methods achieve impressive performance by pre-training o
External link:
http://arxiv.org/abs/2308.09455
Authors:
Tu, Shangqing, Zhang, Zheyuan, Yu, Jifan, Li, Chunyang, Zhang, Siyu, Yao, Zijun, Hou, Lei, Li, Juanzi
Teaching assistants have played essential roles in the long history of education. However, few MOOC platforms are providing human or virtual teaching assistants to support learning for massive online students due to the complexity of real-world onlin
External link:
http://arxiv.org/abs/2308.05935