Showing 1 - 10 of 13,299 for search: '"Libin, P"'
Video summarization mainly aims to produce a compact, short, informative, and representative synopsis of raw videos, which is of great importance for browsing, analyzing, and understanding video content. Dominant video summarization approaches are ge
External link:
http://arxiv.org/abs/2501.00882
Author:
Wang, Yeyuan; Gao, Dehong; Li, Bin; Long, Rujiao; Yi, Lei; Cai, Xiaoyan; Yang, Libin; Zhang, Jinxia; Yu, Shanqing; Xuan, Qi
The impressive performance of Large Language Model (LLM) has prompted researchers to develop Multi-modal LLM (MLLM), which has shown great potential for various multi-modal tasks. However, current MLLM often struggles to effectively address fine-grai
External link:
http://arxiv.org/abs/2412.16869
Existing Vision-Language Pretraining (VLP) methods have achieved remarkable improvements across a variety of vision-language tasks, confirming their effectiveness in capturing coarse-grained semantic correlations. However, their capability for fine-g
External link:
http://arxiv.org/abs/2412.10029
Author:
Ma, Yufei; Liang, Zihan; Dai, Huangyu; Chen, Ben; Gao, Dehong; Ran, Zhuoran; Zihan, Wang; Jin, Linbo; Jiang, Wen; Zhang, Guannan; Cai, Xiaoyan; Yang, Libin
The growing demand for larger-scale models in the development of Large Language Models (LLMs) poses challenges for efficient training within limited computational resources. Traditional fine-tuning methods often exhibit ins
External link:
http://arxiv.org/abs/2412.07405
Estimating spatial distributions is important in data analysis, such as traffic flow forecasting and epidemic prevention. To achieve accurate spatial distribution estimation, the analysis needs to collect sufficient user data. However, collecting dat
External link:
http://arxiv.org/abs/2412.06541
Author:
Liu, Libin; Chen, Shen; Jia, Sen; Shi, Jingzhe; Jiang, Zhongyu; Jin, Can; Zongkai, Wu; Hwang, Jenq-Neng; Li, Lei
Spatial intelligence is foundational to AI systems that interact with the physical world, particularly in 3D scene generation and spatial comprehension. Current methodologies for 3D scene generation often rely heavily on predefined datasets, and stru
External link:
http://arxiv.org/abs/2412.00091
Author:
Liu, Nian; Liu, Libin; Zhang, Zilong; Wang, Zi; Xie, Hongzhao; Liu, Tengyu; Tong, Xinyi; Yang, Yaodong; He, Zhaofeng
Learning natural and diverse behaviors from human motion datasets remains challenging in physics-based character control. Existing conditional adversarial models often suffer from tight and biased embedding distributions where embeddings from the sam
External link:
http://arxiv.org/abs/2411.06459
Pubic symphysis-fetal head segmentation in transperineal ultrasound images plays a critical role for the assessment of fetal head descent and progression. Existing transformer segmentation methods based on sparse attention mechanism use handcrafted s
External link:
http://arxiv.org/abs/2410.10352
Dense Convolutional Network has been continuously refined to adopt a highly efficient and compact architecture, owing to its lightweight and efficient structure. However, the current Dense-like architectures are mainly designed manually, it becomes i
External link:
http://arxiv.org/abs/2410.07499
In recent years, significant progress has been made in tumor segmentation within the field of digital pathology. However, variations in organs, tissue preparation methods, and image acquisition processes can lead to domain discrepancies among digital
External link:
http://arxiv.org/abs/2409.11752