Showing 1 - 10 of 9,506 for search: '"An, Wenzhen"'
Author:
Fang, Irving, Shi, Kairui, He, Xujin, Tan, Siqi, Wang, Yifan, Zhao, Hanwen, Huang, Hung-Jui, Yuan, Wenzhen, Feng, Chen, Zhang, Jing
Humans effortlessly integrate common-sense knowledge with sensory input from vision and touch to understand their surroundings. Emulating this capability, we introduce FusionSense, a novel 3D reconstruction framework that enables robots to fuse prior
External link:
http://arxiv.org/abs/2410.08282
Author:
Ruan, Jiacheng, Yuan, Wenzhen, Lin, Zehao, Liao, Ning, Li, Zhiyu, Xiong, Feiyu, Liu, Ting, Fu, Yuzhuo
Large visual-language models (LVLMs) have achieved great success in multiple applications. However, they still encounter challenges in complex scenes, especially those involving camouflaged objects. This is primarily due to the lack of samples relate
External link:
http://arxiv.org/abs/2409.16084
Author:
Yan, Ziyang, Dong, Wenzhen, Shao, Yihua, Lu, Yuhang, Haiyang, Liu, Liu, Jingwen, Wang, Haozhe, Wang, Zhe, Wang, Yan, Remondino, Fabio, Ma, Yuexin
End-to-end autonomous driving with vision-only is not only more cost-effective compared to LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we propose Ren
External link:
http://arxiv.org/abs/2409.11356
Author:
Yu, Jieming, Wang, An, Dong, Wenzhen, Xu, Mengya, Islam, Mobarakol, Wang, Jie, Bai, Long, Ren, Hongliang
The recent Segment Anything Model (SAM) 2 has demonstrated remarkable foundational competence in semantic segmentation, with its memory mechanism and mask decoder further addressing challenges in video tracking and object occlusion, thereby achieving
External link:
http://arxiv.org/abs/2408.04593
Author:
Guo, Ruohao, Qu, Liao, Niu, Dantong, Qi, Yanyu, Yue, Wenzhen, Shi, Ji, Xing, Bowei, Ying, Xianghua
Audio-visual semantic segmentation (AVSS) aims to segment and classify sounding objects in videos with acoustic cues. However, most approaches operate on the closed-set assumption and only identify pre-defined categories from training data, lacking th
External link:
http://arxiv.org/abs/2407.21721
Real-time 3D reconstruction of surgical scenes plays a vital role in computer-assisted surgery, holding a promise to enhance surgeons' visibility. Recent advancements in 3D Gaussian Splatting (3DGS) have shown great potential for real-time novel view
External link:
http://arxiv.org/abs/2407.02918
In recent years, Large Language Models (LLMs) have made significant strides towards Artificial General Intelligence. However, training these models from scratch requires substantial computational resources and vast amounts of text data. In this paper
External link:
http://arxiv.org/abs/2407.02118
Cooking robots have long been desired by the commercial market, while the technical challenge is still significant. A major difficulty comes from the demand of perceiving and handling liquid with different properties. This paper presents a robot syst
External link:
http://arxiv.org/abs/2407.01755
Author:
Yue, Wenzhen, Ying, Xianghua, Guo, Ruohao, Chen, DongDong, Shi, Ji, Xing, Bowei, Zhu, Yuqing, Chen, Taiyan
In this paper, we present the Sub-Adjacent Transformer with a novel attention mechanism for unsupervised time series anomaly detection. Unlike previous approaches that rely on all the points within some neighborhood for time point reconstruction, our
External link:
http://arxiv.org/abs/2404.18948
Compliant grippers enable robots to work with humans in unstructured environments. In general, these grippers can improve with tactile sensing to estimate the state of objects around them to precisely manipulate objects. However, co-designing complia
External link:
http://arxiv.org/abs/2403.04638