Showing 1 - 10 of 15,300 results for search: '"WU, Feng"'
Integrating multimodal Electronic Health Records (EHR) data, such as numerical time series and free-text clinical reports, has great potential in predicting clinical outcomes. However, prior work has primarily focused on capturing temporal interactio
External link:
http://arxiv.org/abs/2411.00696
Authors:
Liu, Haoyang, Wang, Jie, Zhang, Wanbo, Geng, Zijie, Kuang, Yufei, Li, Xijun, Li, Bin, Zhang, Yongdong, Wu, Feng
Mixed-integer linear programming (MILP) is one of the most popular mathematical formulations with numerous applications. In practice, improving the performance of MILP solvers often requires a large amount of high-quality data, which can be challengi
External link:
http://arxiv.org/abs/2410.22806
Authors:
Xing, Long, Huang, Qidong, Dong, Xiaoyi, Lu, Jiajie, Zhang, Pan, Zang, Yuhang, Cao, Yuhang, He, Conghui, Wang, Jiaqi, Wu, Feng, Lin, Dahua
In large vision-language models (LVLMs), images serve as inputs that carry a wealth of information. As the idiom "A picture is worth a thousand words" implies, representing a single image in current LVLMs can require hundreds or even thousands of tok
External link:
http://arxiv.org/abs/2410.17247
Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language model (RALM) -- which enhances models with up-to-date knowledge -- emerges as a promising
External link:
http://arxiv.org/abs/2410.15116
We introduce D-FINE, a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. D-FINE comprises two key components: Fine-grained Distribution Refinement (FDR)
External link:
http://arxiv.org/abs/2410.13842
Authors:
Li, Zhuoyuan, Liao, Junqi, Tang, Chuanbo, Zhang, Haotian, Li, Yuqi, Bian, Yifan, Sheng, Xihua, Feng, Xinmin, Li, Yao, Gao, Changsheng, Li, Li, Liu, Dong, Wu, Feng
Image/video coding has been a remarkable research area for both academia and industry for many years. Testing datasets, especially high-quality image/video datasets, are desirable for the justified evaluation of coding-related research, practical appl
External link:
http://arxiv.org/abs/2409.08481
Interactive segmentation algorithms based on click points have garnered significant attention from researchers in recent years. However, existing studies typically use sparse click maps as model inputs to segment specific target objects, which primar
External link:
http://arxiv.org/abs/2408.06021
The emerging large models have achieved notable progress in the fields of natural language processing and computer vision. However, large models for neural video coding are still unexplored. In this paper, we try to explore how to build a large neura
External link:
http://arxiv.org/abs/2407.19402
Inter prediction is a key technology to reduce the temporal redundancy in video coding. In natural videos, there are usually multiple moving objects with variable velocity, resulting in complex motion fields that are difficult to represent compactly.
External link:
http://arxiv.org/abs/2407.11541
In-loop filtering (ILF) is a key technology for removing the artifacts in image/video coding standards. Recently, neural network-based in-loop filtering methods have achieved remarkable coding gains beyond the capability of advanced video coding standards,
External link:
http://arxiv.org/abs/2407.10926