Showing 1 - 10 of 36 for search: '"Zhu, Lanyun"'
The ubiquity and value of tables as semi-structured data across various domains necessitate advanced methods for understanding their complexity and vast amounts of information. Despite the impressive capabilities of large language models (LLMs) in…
External link:
http://arxiv.org/abs/2411.08516
Author:
Chen, Tianrun, Yu, Chunan, Hu, Yuanqi, Li, Jing, Xu, Tao, Cao, Runlong, Zhu, Lanyun, Zang, Ying, Zhang, Yong, Li, Zejian, Sun, Lingyun
In this paper, we propose Img2CAD, the first approach to our knowledge that uses 2D image inputs to generate CAD models with editable parameters. Unlike existing AI methods for 3D model generation from text or image inputs, which often rely on mesh-based…
External link:
http://arxiv.org/abs/2410.03417
Many few-shot segmentation (FSS) methods use cross attention to fuse the support foreground (FG) into query features, regardless of the quadratic complexity. A recent advance, Mamba, can also capture intra-sequence dependencies well, yet the complexity is…
External link:
http://arxiv.org/abs/2409.19613
Author:
Chen, Tianrun, Lu, Ankang, Zhu, Lanyun, Ding, Chaotao, Yu, Chunan, Ji, Deyi, Li, Zejian, Sun, Lingyun, Mao, Papa, Zang, Ying
The advent of large models, also known as foundation models, has significantly transformed the AI research landscape, with models like Segment Anything (SAM) achieving notable success in diverse image segmentation scenarios. Despite its advancements,…
External link:
http://arxiv.org/abs/2408.04579
Author:
Chen, Tianrun, Ding, Chaotao, Zhu, Lanyun, Xu, Tao, Ji, Deyi, Wang, Yan, Zang, Ying, Li, Zejian
Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have been pivotal in biomedical image segmentation, yet their ability to manage long-range dependencies remains constrained by inherent locality and computational overhead. To overcome…
External link:
http://arxiv.org/abs/2407.01530
In this paper, we address the challenge of Perspective-Invariant Learning in machine learning and computer vision, which involves enabling a network to understand images from varying perspectives to achieve consistent semantic interpretation. While…
External link:
http://arxiv.org/abs/2406.10475
Author:
Chen, Tianrun, Yu, Chunan, Li, Jing, Zhang, Jianqi, Zhu, Lanyun, Ji, Deyi, Zhang, Yong, Zang, Ying, Li, Zejian, Sun, Lingyun
In this paper, we introduce a new task: Zero-Shot 3D Reasoning Segmentation for part searching and localization in objects, which is a new paradigm for 3D segmentation that transcends the limitations of previous category-specific 3D semantic segmentation…
External link:
http://arxiv.org/abs/2405.19326
Author:
Ji, Deyi, Gao, Siqi, Zhu, Lanyun, Zhu, Qi, Zhao, Yiru, Xu, Peng, Lu, Hongtao, Zhao, Feng, Ye, Jieping
In this paper, we address the challenge of multi-object tracking (MOT) in moving Unmanned Aerial Vehicle (UAV) scenarios, where irregular flight trajectories, such as hovering, turning left/right, and moving up/down, lead to significantly greater…
External link:
http://arxiv.org/abs/2403.10830
Despite rapid development and widespread application, Large Vision-Language Models (LVLMs) confront a serious challenge of being prone to generating hallucinations. An over-reliance on linguistic priors has been identified as a key…
External link:
http://arxiv.org/abs/2402.18476
Author:
Zang, Ying, Fu, Chenglong, Cao, Runlong, Zhu, Didi, Zhang, Min, Hu, Wenjun, Zhu, Lanyun, Chen, Tianrun
Referring expression segmentation (RES), a task that involves localizing specific instance-level objects based on free-form linguistic descriptions, has emerged as a crucial frontier in human-AI interaction. It demands an intricate understanding of…
External link:
http://arxiv.org/abs/2402.05589