Showing 1 - 10 of 180 for search: '"Lin, Bingqian"'
Author:
Zhang, Kaidong, Ren, Pengzhen, Lin, Bingqian, Lin, Junfan, Ma, Shikui, Xu, Hang, Liang, Xiaodan
Language-guided robotic manipulation is a challenging task that requires an embodied agent to follow abstract user instructions to accomplish various complex manipulation tasks. Previous work trivially fitting the data without revealing the relation …
External link:
http://arxiv.org/abs/2410.10394
LLM-based agents have demonstrated impressive zero-shot performance in the vision-language navigation (VLN) task. However, existing LLM-based methods often focus only on solving high-level task planning by selecting nodes in predefined navigation graphs …
External link:
http://arxiv.org/abs/2407.05890
Author:
Lin, Bingqian, Nie, Yunshuang, Wei, Ziming, Zhu, Yi, Xu, Hang, Ma, Shikui, Liu, Jianzhuang, Liang, Xiaodan
Vision-Language Navigation (VLN) requires the agent to follow language instructions to reach a target position. A key factor for successful navigation is to align the landmarks implied in the instruction with diverse visual observations. However, …
External link:
http://arxiv.org/abs/2405.18721
Author:
Lin, Bingqian, Nie, Yunshuang, Wei, Ziming, Chen, Jiaqi, Ma, Shikui, Han, Jianhua, Xu, Hang, Chang, Xiaojun, Liang, Xiaodan
Vision-and-Language Navigation (VLN), as a crucial research problem of Embodied AI, requires an embodied agent to navigate through complex 3D environments following natural language instructions. Recent research has highlighted the promising capacity …
External link:
http://arxiv.org/abs/2403.07376
Published in:
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, 2023)
Vision-and-language navigation (VLN) asks an agent to follow a given language instruction to navigate through a real 3D environment. Despite significant advances, conventional VLN agents are typically trained in disturbance-free environments and …
External link:
http://arxiv.org/abs/2403.05770
Embodied agents equipped with GPT as their brains have exhibited extraordinary decision-making and generalization abilities across various tasks. However, existing zero-shot agents for vision-and-language navigation (VLN) only prompt GPT-4 to select …
External link:
http://arxiv.org/abs/2401.07314
Author:
Lin, Bingqian, Chen, Zicong, Li, Mingjie, Lin, Haokun, Xu, Hang, Zhu, Yi, Liu, Jianzhuang, Cai, Wenjia, Yang, Lei, Zhao, Shen, Wu, Chenfei, Chen, Ling, Chang, Xiaojun, Yang, Yi, Xing, Lei, Liang, Xiaodan
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks, which is very practical in the medical domain. It can significantly reduce the need for large amounts of task-specific data by …
External link:
http://arxiv.org/abs/2304.14204
Published in:
CVPR 2023
Automatic radiology reporting has great clinical potential to relieve radiologists from heavy workloads and improve diagnosis interpretation. Recently, researchers have enhanced data-driven neural networks with medical knowledge graphs to eliminate …
External link:
http://arxiv.org/abs/2303.10323
Vision-Language Navigation (VLN) is a challenging task which requires an agent to align complex visual observations to language instructions to reach the goal position. Most existing VLN agents directly learn to align the raw directional features and …
External link:
http://arxiv.org/abs/2302.06072
Vision-Language Navigation (VLN) is a challenging task that requires an embodied agent to perform action-level modality alignment, i.e., make instruction-asked actions sequentially in complex visual environments. Most existing VLN agents learn the …
External link:
http://arxiv.org/abs/2205.15509