Search results - "Ding, Xinpeng"

Report

Tri-Plane Mamba: Efficiently Adapting Segment Anything Model for 3D Medical Images

Authors: Wang, Hualiang; Lin, Yiqun; Ding, Xinpeng; Li, Xiaomeng

General networks for 3D medical image segmentation have recently undergone extensive exploration. Behind the exceptional performance of these networks lies a significant demand for a large volume of pixel-level annotated data, which is time-consuming

External link: http://arxiv.org/abs/2409.08492

Report

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Authors: Huang, Runhui; Ding, Xinpeng; Wang, Chunwei; Han, Jianhua; Liu, Yulong; Zhao, Hengshuang; Xu, Hang; Hou, Lu; Zhang, Wei; Liang, Xiaodan

High-resolution inputs enable Large Vision-Language Models (LVLMs) to discern finer visual details, enhancing their comprehension capabilities. To reduce the training and computation costs caused by high-resolution input, one promising direction is t

External link: http://arxiv.org/abs/2407.08706

Report

C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction

Authors: Lin, Yiqun; Yang, Jiewen; Wang, Hualiang; Ding, Xinpeng; Zhao, Wei; Li, Xiaomeng

Cone beam computed tomography (CBCT) is an important imaging technology widely used in medical scenarios, such as diagnosis and preoperative planning. Using fewer projection views to reconstruct CT, also known as sparse-view reconstruction, can reduc

External link: http://arxiv.org/abs/2406.03902

Report

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models

Authors: Ding, Xinpeng; Han, Jianhua; Xu, Hang; Liang, Xiaodan; Zhang, Wei; Li, Xiaomeng

The rise of multimodal large language models (MLLMs) has spurred interest in language-based driving tasks. However, existing research typically focuses on limited tasks and often omits key multi-view and temporal information which is crucial for robu

External link: http://arxiv.org/abs/2401.00988

Report

EtC: Temporal Boundary Expand then Clarify for Weakly Supervised Video Grounding with Multimodal Large Language Model

Authors: Li, Guozhang; Ding, Xinpeng; Cheng, De; Li, Jie; Wang, Nannan; Gao, Xinbo

Early weakly supervised video grounding (WSVG) methods often struggle with incomplete boundary detection due to the absence of temporal boundary annotations. To bridge the gap between video-level and boundary-level annotation, explicit-supervision me

External link: http://arxiv.org/abs/2312.02483

Report

GraphEcho: Graph-Driven Unsupervised Domain Adaptation for Echocardiogram Video Segmentation

Authors: Yang, Jiewen; Ding, Xinpeng; Zheng, Ziyang; Xu, Xiaowei; Li, Xiaomeng

Echocardiogram video segmentation plays an important role in cardiac disease diagnosis. This paper studies the unsupervised domain adaptation (UDA) for echocardiogram video segmentation, where the goal is to generalize the model trained on the source d

External link: http://arxiv.org/abs/2309.11145

Report

GL-Fusion: Global-Local Fusion Network for Multi-view Echocardiogram Video Segmentation

Authors: Zheng, Ziyang; Yang, Jiewen; Ding, Xinpeng; Xu, Xiaowei; Li, Xiaomeng

Cardiac structure segmentation from echocardiogram videos plays a crucial role in diagnosing heart disease. The combination of multi-view echocardiogram data is essential to enhance the accuracy and robustness of automated methods. However, due to th

External link: http://arxiv.org/abs/2309.11144

Report

HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving

Authors: Ding, Xinpeng; Han, Jianhua; Xu, Hang; Zhang, Wei; Li, Xiaomeng

Autonomous driving systems generally employ separate models for different tasks resulting in intricate designs. For the first time, we leverage singular multimodal large language models (MLLMs) to consolidate multiple autonomous driving tasks from vi

External link: http://arxiv.org/abs/2309.05186

Report

Uniformly Distributed Category Prototype-Guided Vision-Language Framework for Long-Tail Recognition

Authors: Fu, Siming; He, Xiaoxuan; Ding, Xinpeng; Cao, Yuchen; Wang, Hualiang

Published in: ACM MM2023

Recently, large-scale pre-trained vision-language models have presented benefits for alleviating class imbalance in long-tailed recognition. However, the long-tailed data distribution can corrupt the representation space, where the distance between h

External link: http://arxiv.org/abs/2308.12522

Report

Context-Aware Pseudo-Label Refinement for Source-Free Domain Adaptive Fundus Image Segmentation

Authors: Huai, Zheang; Ding, Xinpeng; Li, Yi; Li, Xiaomeng

In the domain adaptation problem, source data may be unavailable to the target client side due to privacy or intellectual property issues. Source-free unsupervised domain adaptation (SF-UDA) aims at adapting a model trained on the source side to alig

External link: http://arxiv.org/abs/2308.07731
