Showing 1 - 10 of 241 for search: '"Zhang, Yanyong"'
The integration of large language models (LLMs) with robotics has significantly advanced robots' abilities in perception, cognition, and task planning. The use of natural language interfaces offers a unified approach for expressing the capability dif
External link:
http://arxiv.org/abs/2409.16030
Author:
You, Guoliang, Chu, Xiaomeng, Duan, Yifan, Li, Xingchen, Zhang, Sha, Ji, Jianmin, Zhang, Yanyong
Multi-modal systems enhance performance in autonomous driving but face inefficiencies due to indiscriminate processing within each modality. Additionally, the independent feature learning of each modality lacks interaction, which results in extracted
External link:
http://arxiv.org/abs/2409.14170
Training deep learning models for semantic occupancy prediction is challenging due to factors such as a large number of occupancy cells, severe occlusion, limited visual cues, complicated driving scenarios, etc. Recent methods often adopt transformer
External link:
http://arxiv.org/abs/2408.09859
Author:
Lei, Jiayu, Zhang, Xiaoman, Wu, Chaoyi, Dai, Lisong, Zhang, Ya, Zhang, Yanyong, Wang, Yanfeng, Xie, Weidi, Li, Yuehua
Radiologists are tasked with interpreting a large number of images on a daily basis, with the responsibility of generating corresponding reports. This demanding workload elevates the risk of human error, potentially leading to treatment delays, increa
External link:
http://arxiv.org/abs/2407.16684
The recent advances in query-based multi-camera 3D object detection are featured by initializing object queries in the 3D space, and then sampling features from perspective-view images to perform multi-round query refinement. In such a framework, que
External link:
http://arxiv.org/abs/2407.14923
Author:
You, Guoliang, Chu, Xiaomeng, Duan, Yifan, Zhang, Wenyu, Li, Xingchen, Zhang, Sha, Li, Yao, Ji, Jianmin, Zhang, Yanyong
When planning for autonomous driving, it is crucial to consider essential traffic elements such as lanes, intersections, traffic regulations, and dynamic agents. However, they are often overlooked by the traditional end-to-end planning methods, likel
External link:
http://arxiv.org/abs/2407.11644
Author:
Wu, Minghui, Xu, Luzhen, Zhang, Jie, Tang, Haitao, Yue, Yanyan, Liao, Ruizhi, Zhao, Jintao, Zhang, Zhengzhe, Wang, Yichi, Yan, Haoyin, Yu, Hongliang, Ma, Tongle, Liu, Jiachen, Wu, Chongliang, Li, Yongchao, Zhang, Yanyong, Fang, Xin, Zhang, Yue
This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case. We implement the fron
External link:
http://arxiv.org/abs/2407.02052
The conditional diffusion model has been demonstrated as an efficient tool for learning robot policies, owing to its advancement to accurately model the conditional distribution of policies. The intricate nature of real-world scenarios, characterized
External link:
http://arxiv.org/abs/2407.01950
The deployment of Large Language Models (LLMs) on edge devices is increasingly important to enhance on-device intelligence. Weight quantization is crucial for reducing the memory footprint of LLMs on devices. However, low-bit LLMs necessitate mixed p
External link:
http://arxiv.org/abs/2407.00088
Large Language Models (LLMs) possess extensive foundational knowledge and moderate reasoning abilities, making them suitable for general task planning in open-world scenarios. However, it is challenging to ground an LLM-generated plan to be executable
External link:
http://arxiv.org/abs/2406.03367