Showing 1 - 10 of 2,601 for search: '"qian qi"'
Multiple clustering aims to discover various latent structures of data from different aspects. Deep multiple clustering methods have achieved remarkable performance by exploiting complex patterns and relationships in data. However, existing works str
External link:
http://arxiv.org/abs/2411.03978
Diffusion models demonstrate impressive image generation performance with text guidance. Inspired by the learning process of diffusion, existing images can be edited according to text by DDIM inversion. However, the vanilla DDIM inversion is not opti
External link:
http://arxiv.org/abs/2409.10476
In many real-world applications, the frequency distribution of class labels for training data can exhibit a long-tailed distribution, which challenges traditional approaches of training deep neural networks that require heavy amounts of balanced data
External link:
http://arxiv.org/abs/2409.03583
Deep features extracted from certain layers of a pre-trained deep model show superior performance over the conventional hand-crafted features. Compared with fine-tuning or linear probing that can explore diverse augmentations, e.g., random crop/flippi
External link:
http://arxiv.org/abs/2408.13351
Vision-language pre-training such as CLIP enables zero-shot transfer that can classify images according to the candidate class names. While CLIP demonstrates an impressive zero-shot performance on diverse downstream tasks, the distribution from the t
External link:
http://arxiv.org/abs/2408.13320
Authors:
Ye, Jiabo; Xu, Haiyang; Liu, Haowei; Hu, Anwen; Yan, Ming; Qian, Qi; Zhang, Ji; Huang, Fei; Zhou, Jingren
Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities in executing instructions for a variety of single-image tasks. Despite this progress, significant challenges remain in modeling long image sequences. In this work, we
External link:
http://arxiv.org/abs/2408.04840
Authors:
Wang, Xiaohua; Wang, Zhenghua; Gao, Xuan; Zhang, Feiran; Wu, Yixin; Xu, Zhibo; Shi, Tianyuan; Wang, Zhengyuan; Li, Shizheng; Qian, Qi; Yin, Ruicheng; Lv, Changze; Zheng, Xiaoqing; Huang, Xuanjing
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been pro
External link:
http://arxiv.org/abs/2407.01219
Personalized text-to-image generation has attracted unprecedented attention in recent years due to its unique capability of generating highly personalized images from an input concept dataset and a novel textual prompt. However, previous
External link:
http://arxiv.org/abs/2407.00608
Multiple clustering has gained significant attention in recent years due to its potential to reveal multiple hidden structures of data from different perspectives. The advent of deep multiple clustering techniques has notably advanced the performance
External link:
http://arxiv.org/abs/2404.15655
Authors:
Hu, Anwen; Shi, Yaya; Xu, Haiyang; Ye, Jiabo; Ye, Qinghao; Yan, Ming; Li, Chenliang; Qian, Qi; Zhang, Ji; Huang, Fei
Recently, the strong text creation ability of Large Language Models (LLMs) has given rise to many tools for assisting paper reading or even writing. However, the weak diagram analysis abilities of LLMs or Multimodal LLMs greatly limit their applicatio
External link:
http://arxiv.org/abs/2311.18248