Showing 1 - 10 of 1,082 for search: '"Li, Shutao"'
The Agriculture-Vision Challenge at CVPR 2024 aims at leveraging semantic segmentation models to produce pixel-level semantic segmentation labels within regions of interest for multi-modality satellite images. It is one of the most famous and competi
External link:
http://arxiv.org/abs/2406.12271
Existing methods for long video understanding primarily focus on videos lasting only tens of seconds, with limited exploration of techniques for handling longer videos. The increased number of frames in longer videos presents two main challenges: dif
External link:
http://arxiv.org/abs/2406.12846
Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models using weak labels and is receiving significant attention due to its low annotation cost. Existing approaches focus on generating pseudo labels for supervision while larg
External link:
http://arxiv.org/abs/2403.13225
Knowledge-based visual question answering (VQA) requires world knowledge beyond the image for an accurate answer. Recently, instead of extra knowledge bases, a large language model (LLM) like GPT-3 is activated as an implicit knowledge engine to jointly
External link:
http://arxiv.org/abs/2402.02503
Integrating a low-spatial-resolution hyperspectral image (LR-HSI) with a high-spatial-resolution multispectral image (HR-MSI) is recognized as a valid method for acquiring HR-HSI. Among the current fusion approaches, the tensor ring (TR) decompositio
External link:
http://arxiv.org/abs/2310.10044
The integration of diverse visual prompts like clicks, scribbles, and boxes in interactive image segmentation could significantly facilitate user interaction as well as improve interaction efficiency. Most existing studies focus on a single type of v
External link:
http://arxiv.org/abs/2306.06656
Authors:
Lin, Jiacheng, Chen, Jiajun, Yang, Kailun, Roitberg, Alina, Li, Siyu, Li, Zhiyong, Li, Shutao
Interactive Image Segmentation (IIS) has emerged as a promising technique for decreasing annotation time. Substantial progress has been made in pre- and post-processing for IIS, but the critical issue of interaction ambiguity, notably hindering segme
External link:
http://arxiv.org/abs/2305.04276
Previous methods for dynamic facial expression recognition (DFER) in the wild are mainly based on Convolutional Neural Networks (CNNs), whose local operations ignore the long-range dependencies in videos. Transformer-based methods for DFER can achiev
External link:
http://arxiv.org/abs/2305.03343