Search results

Report

FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction

Authors: Fang, Irving, Shi, Kairui, He, Xujin, Tan, Siqi, Wang, Yifan, Zhao, Hanwen, Huang, Hung-Jui, Yuan, Wenzhen, Feng, Chen, Zhang, Jing

Humans effortlessly integrate common-sense knowledge with sensory input from vision and touch to understand their surroundings. Emulating this capability, we introduce FusionSense, a novel 3D reconstruction framework that enables robots to fuse prior

External link: http://arxiv.org/abs/2410.08282

Report

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Authors: Ruan, Jiacheng, Yuan, Wenzhen, Lin, Zehao, Liao, Ning, Li, Zhiyu, Xiong, Feiyu, Liu, Ting, Fu, Yuzhuo

Large visual-language models (LVLMs) have achieved great success in multiple applications. However, they still encounter challenges in complex scenes, especially those involving camouflaged objects. This is primarily due to the lack of samples relate

External link: http://arxiv.org/abs/2409.16084

Report

RenderWorld: World Model with Self-Supervised 3D Label

Authors: Yan, Ziyang, Dong, Wenzhen, Shao, Yihua, Lu, Yuhang, Haiyang, Liu, Liu, Jingwen, Wang, Haozhe, Wang, Zhe, Wang, Yan, Remondino, Fabio, Ma, Yuexin

End-to-end autonomous driving with vision-only is not only more cost-effective compared to LiDAR-vision fusion but also more reliable than traditional methods. To achieve an economical and robust purely visual autonomous driving system, we propose Ren

External link: http://arxiv.org/abs/2409.11356

Report

SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation

Authors: Yu, Jieming, Wang, An, Dong, Wenzhen, Xu, Mengya, Islam, Mobarakol, Wang, Jie, Bai, Long, Ren, Hongliang

The recent Segment Anything Model (SAM) 2 has demonstrated remarkable foundational competence in semantic segmentation, with its memory mechanism and mask decoder further addressing challenges in video tracking and object occlusion, thereby achieving

External link: http://arxiv.org/abs/2408.04593

Report

Open-Vocabulary Audio-Visual Semantic Segmentation

Authors: Guo, Ruohao, Qu, Liao, Niu, Dantong, Qi, Yanyu, Yue, Wenzhen, Shi, Ji, Xing, Bowei, Ying, Xianghua

Audio-visual semantic segmentation (AVSS) aims to segment and classify sounding objects in videos with acoustic cues. However, most approaches operate on the closed-set assumption and only identify pre-defined categories from training data, lacking th

External link: http://arxiv.org/abs/2407.21721

Report

Free-SurGS: SfM-Free 3D Gaussian Splatting for Surgical Scene Reconstruction

Authors: Guo, Jiaxin, Wang, Jiangliu, Kang, Di, Dong, Wenzhen, Wang, Wenting, Liu, Yun-hui

Real-time 3D reconstruction of surgical scenes plays a vital role in computer-assisted surgery, holding a promise to enhance surgeons' visibility. Recent advancements in 3D Gaussian Splatting (3DGS) have shown great potential for real-time novel view

External link: http://arxiv.org/abs/2407.02918

Report

Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale

Authors: Zheng, Wenzhen, Pan, Wenbo, Xu, Xu, Qin, Libo, Yue, Li, Zhou, Ming

In recent years, Large Language Models (LLMs) have made significant strides towards Artificial General Intelligence. However, training these models from scratch requires substantial computational resources and vast amounts of text data. In this paper

External link: http://arxiv.org/abs/2407.02118

Report

An Intelligent Robotic System for Perceptive Pancake Batter Stirring and Precise Pouring

Authors: Luo, Xinyuan, Jin, Shengmiao, Huang, Hung-Jui, Yuan, Wenzhen

Cooking robots have long been desired by the commercial market, while the technical challenge is still significant. A major difficulty comes from the demand of perceiving and handling liquid with different properties. This paper presents a robot syst

External link: http://arxiv.org/abs/2407.01755

Report

Sub-Adjacent Transformer: Improving Time Series Anomaly Detection with Reconstruction Error from Sub-Adjacent Neighborhoods

Authors: Yue, Wenzhen, Ying, Xianghua, Guo, Ruohao, Chen, DongDong, Shi, Ji, Xing, Bowei, Zhu, Yuqing, Chen, Taiyan

In this paper, we present the Sub-Adjacent Transformer with a novel attention mechanism for unsupervised time series anomaly detection. Unlike previous approaches that rely on all the points within some neighborhood for time point reconstruction, our

External link: http://arxiv.org/abs/2404.18948

Report

Scalable, Simulation-Guided Compliant Tactile Finger Design

Authors: Ma, Yuxiang, Agarwal, Arpit, Liu, Sandra Q., Yuan, Wenzhen, Adelson, Edward H.

Compliant grippers enable robots to work with humans in unstructured environments. In general, these grippers can improve with tactile sensing to estimate the state of objects around them to precisely manipulate objects. However, co-designing complia

External link: http://arxiv.org/abs/2403.04638
