Search results

Report

Asymmetric Reinforcing against Multi-modal Representation Bias

Authors: Gao, Xiyuan; Cao, Bing; Zhu, Pengfei; Wang, Nannan; Hu, Qinghua

The strength of multimodal learning lies in its ability to integrate information from various sources, providing rich and comprehensive insights. However, in real-world scenarios, multi-modal systems often face the challenge of dynamic modality contr

External link: http://arxiv.org/abs/2501.01240


Report

$S^3$: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models

Authors: Yin, Xiaojie; Wang, Qilong; Cao, Bing; Hu, Qinghua

Recently, many studies have been conducted to enhance the zero-shot generalization ability of vision-language models (e.g., CLIP) by addressing the semantic misalignment between image and text embeddings in downstream tasks. Although many efforts hav

External link: http://arxiv.org/abs/2412.04925


Report

Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark

Authors: Cao, Bing; Lu, Quanhao; Feng, Jiekang; Zhu, Pengfei; Hu, Qinghua; Wang, Qilong

The dynamic imbalance of the fore-background is a major challenge in video object counting, which is usually caused by the sparsity of foreground objects. This often leads to severe under- and over-prediction problems and has been less studied in exi

External link: http://arxiv.org/abs/2411.13056


Report

Dynamic Brightness Adaptation for Robust Multi-modal Image Fusion

Authors: Sun, Yiming; Cao, Bing; Zhu, Pengfei; Hu, Qinghua

Published in: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, Main Track, Pages 1317-1325, 2024

Infrared and visible image fusion aims to integrate modality strengths for visually enhanced, informative images. Visible imaging in real-world scenarios is susceptible to dynamic environmental brightness fluctuations, leading to texture degradation.

External link: http://arxiv.org/abs/2411.04697


Report

Test-Time Dynamic Image Fusion

Authors: Cao, Bing; Xia, Yinan; Ding, Yi; Zhang, Changqing; Hu, Qinghua

The inherent challenge of image fusion lies in capturing the correlation of multi-source images and comprehensively integrating effective information from different sources. Most existing techniques fail to perform dynamic image fusion while notably

External link: http://arxiv.org/abs/2411.02840


Report

Conditional Controllable Image Fusion

Authors: Cao, Bing; Xu, Xingxin; Zhu, Pengfei; Wang, Qilong; Hu, Qinghua

Image fusion aims to integrate complementary information from multiple input images acquired through various sources to synthesize a new fused image. Existing methods usually employ distinct constraint designs tailored to specific scenes, forming fix

External link: http://arxiv.org/abs/2411.01573


Report

Predictive Dynamic Fusion

Authors: Cao, Bing; Xia, Yinan; Ding, Yi; Zhang, Changqing; Hu, Qinghua

Multimodal fusion is crucial in joint decision-making systems for rendering holistic judgments. Since multimodal data changes in open environments, dynamic fusion has emerged and achieved remarkable progress in numerous applications. However, most ex

External link: http://arxiv.org/abs/2406.04802


Report

Visible and Clear: Finding Tiny Objects in Difference Map

Authors: Cao, Bing; Yao, Haiyu; Zhu, Pengfei; Hu, Qinghua

Tiny object detection is one of the key challenges in the field of object detection. The performance of most generic detectors dramatically decreases in tiny object detection tasks. The main challenge lies in extracting effective features of tiny obj

External link: http://arxiv.org/abs/2405.11276


Report

CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models

Authors: Wang, Qinghe; Li, Baolu; Li, Xiaomin; Cao, Bing; Ma, Liqian; Lu, Huchuan; Jia, Xu

Recent advances in text-to-image models have opened new frontiers in human-centric generation. However, these models cannot be directly employed to generate images with consistent newly coined identities. In this work, we propose CharacterFactory, a

External link: http://arxiv.org/abs/2404.15677


Report

Task-Customized Mixture of Adapters for General Image Fusion

Authors: Zhu, Pengfei; Sun, Yang; Cao, Bing; Hu, Qinghua

General image fusion aims at integrating important information from multi-source images. However, due to the significant cross-task gap, the respective fusion mechanism varies considerably in practice, resulting in limited performance across subtasks

External link: http://arxiv.org/abs/2403.12494

