Showing 1 - 10 of 424 for search: '"Wang, Yaxiong"'
Recent advancements in image-text matching have been notable, yet prevailing models predominantly cater to broad queries and struggle with accommodating fine-grained query intention. In this paper, we work towards the Entity-centric
External link:
http://arxiv.org/abs/2410.17810
Few-shot class-incremental learning (FSCIL) aims to incrementally recognize new classes using a few samples while maintaining the performance on previously learned classes. One of the effective methods to solve this challenge is to construct prototyp
External link:
http://arxiv.org/abs/2409.11770
Composed Image Retrieval (CIR) involves searching for target images based on an image-text pair query. While current methods treat this as a query-target matching problem, we argue that CIR triplets contain additional associations beyond this primary
External link:
http://arxiv.org/abs/2405.19149
Federated learning (FL) is a popular privacy-preserving paradigm that enables distributed clients to collaboratively train models with a central server while keeping raw data locally. In practice, distinct model architectures, varying data distributi
External link:
http://arxiv.org/abs/2405.17267
Generalized Class Discovery (GCD) aims to dynamically assign labels to unlabelled data partially based on knowledge learned from labelled data, where the unlabelled data may come from known or novel classes. The prevailing approach generally involves
External link:
http://arxiv.org/abs/2404.08995
This paper presents a Geometric-Photometric Joint Alignment (GPJA) method for accurately aligning human expressions by combining geometry and photometric information. Common practices for registering human heads typically involve aligning landmarks w
External link:
http://arxiv.org/abs/2403.02629
This paper aims to enhance diffusion-based text-to-video generation by improving the two input prompts: the noise and the text. To this end, we propose POS, a training-free Prompt Optimization Suite to boost text-to-v
External link:
http://arxiv.org/abs/2311.00949
Composed image retrieval, a task involving the search for a target image using a reference image and a complementary text as the query, has witnessed significant advancements owing to the progress made in cross-modal modeling. Unlike the general imag
External link:
http://arxiv.org/abs/2309.02169
We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions while preserving motion diversity. Despite the recent significant progress in text-based human motion generation, existing methods o
External link:
http://arxiv.org/abs/2309.01372
Few-shot class-incremental learning (FSCIL) aims to continually learn new classes using a few samples while not forgetting the old classes. The key to this task is effective knowledge transfer from the base session to the incremental sessions. Despit
External link:
http://arxiv.org/abs/2306.10942