Showing 1 - 10 of 140 for search: '"Suo, Wei"'
Iterative methods are widely used for solving partial differential equations (PDEs). However, the difficulty in eliminating global low-frequency errors significantly limits their convergence speed. In recent years, neural networks have emerged as a n
External link:
http://arxiv.org/abs/2410.05744
Human-object interaction (HOI) detection aims at capturing human-object pairs in images and corresponding actions. It is an important step toward high-level visual reasoning and scene understanding. However, due to the natural bias from the real wor
External link:
http://arxiv.org/abs/2407.21438
As a fundamental and extensively studied task in computer vision, image segmentation aims to locate and identify different semantic concepts at the pixel level. Recently, inspired by In-Context Learning (ICL), several generalist segmentation framewor
External link:
http://arxiv.org/abs/2407.10233
Vision-Language Instruction Tuning (VLIT) is a critical training phase for Large Vision-Language Models (LVLMs). With the improving capabilities of open-source LVLMs, researchers have increasingly turned to generating VLIT data by using open-source LVL
External link:
http://arxiv.org/abs/2405.12752
Author:
Li, Yukun, Pang, Guansong, Suo, Wei, Jing, Chenchen, Xi, Yuling, Liu, Lingqiao, Chen, Hao, Liang, Guoqiang, Wang, Peng
This paper explores the problem of continual learning (CL) of vision-language models (VLMs) in open domains, where the models need to perform continual updating and inference on a stream of datasets from diverse seen and unseen domains with novel
External link:
http://arxiv.org/abs/2403.10245
Author:
Suo, Wei, Zhang, Weiwei
Numerical simulation is dominant in solving partial differential equations (PDEs), but balancing fine-grained grids with low computational costs is challenging. Recently, solving PDEs with neural networks (NNs) has gained interest, yet cost-effectivene
External link:
http://arxiv.org/abs/2312.06949
Published in:
International Journal of Numerical Methods for Heat & Fluid Flow, 2024, Vol. 34, Issue 9, pp. 3542-3568.
External link:
http://www.emeraldinsight.com/doi/10.1108/HFF-01-2024-0019
The VQA Natural Language Explanation (VQA-NLE) task aims to explain the decision-making process of VQA models in natural language. Unlike traditional attention or gradient analysis, free-text rationales can be easier to understand and gain users' trust.
External link:
http://arxiv.org/abs/2309.02155
Video captioning aims to understand the spatio-temporal semantic concept of the video and generate descriptive sentences. The de-facto approach to this task dictates a text generator to learn from offline-extracted motion or appearance featu
External link:
http://arxiv.org/abs/2205.03039
Referring Expression Comprehension (REC) has become one of the most important tasks in visual reasoning, since it is an essential step for many vision-and-language tasks such as visual question answering. However, it has not been widely used in many
External link:
http://arxiv.org/abs/2105.02061