Search results - "Chen, Haoxing"

Report

Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training

Authors: Huang, Zizheng; Chen, Haoxing; Li, Jiaqi; Lan, Jun; Zhu, Huijia; Wang, Weiqiang; Wang, Limin

Recent Vision Mamba models not only have much lower complexity for processing higher resolution images and longer videos but also the competitive performance with Vision Transformers (ViTs). However, they are stuck into overfitting and thus only pres

External link: http://arxiv.org/abs/2408.17081

Report

DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

Authors: Chen, Haoxing; Hong, Yan; Huang, Zizheng; Xu, Zhuoer; Gu, Zhangxuan; Li, Yaohui; Lan, Jun; Zhu, Huijia; Zhang, Jianfu; Wang, Weiqiang; Li, Huaxiong

Recently, video generation techniques have advanced rapidly. Given the popularity of video content on social media platforms, these models intensify concerns about the spread of fake information. Therefore, there is a growing demand for detectors cap

External link: http://arxiv.org/abs/2405.19707

Report

Dual-Adapter: Training-free Dual Adaptation for Few-shot Out-of-Distribution Detection

Authors: Chen, Xinyi; Li, Yaohui; Chen, Haoxing

We study the problem of few-shot out-of-distribution (OOD) detection, which aims to detect OOD samples from unseen categories during inference time with only a few labeled in-domain (ID) samples. Existing methods mainly focus on training task-aware p

External link: http://arxiv.org/abs/2405.16146

Report

Conditional Prototype Rectification Prompt Learning

Authors: Chen, Haoxing; Li, Yaohui; Huang, Zizheng; Hong, Yan; Xu, Zhuoer; Gu, Zhangxuan; Lan, Jun; Zhu, Huijia; Wang, Weiqiang

Pre-trained large-scale vision-language models (VLMs) have acquired profound understanding of general visual concepts. Recent advancements in efficient transfer learning (ETL) have shown remarkable success in fine-tuning VLMs within the scenario of l

External link: http://arxiv.org/abs/2404.09872

Report

The Devil is in the Few Shots: Iterative Visual Knowledge Completion for Few-shot Learning

Authors: Li, Yaohui; Zhou, Qifeng; Chen, Haoxing; Zhang, Jianbing; Dai, Xinyu; Zhou, Hao

Contrastive Language-Image Pre-training (CLIP) has shown powerful zero-shot learning performance. Few-shot learning aims to further enhance the transfer capability of CLIP by giving few images in each class, aka 'few shots'. Most existing methods eit

External link: http://arxiv.org/abs/2404.09778

Report

Segment Anything Model Meets Image Harmonization

Authors: Chen, Haoxing; Li, Yaohui; Gu, Zhangxuan; Xu, Zhuoer; Lan, Jun; Li, Huaxiong

Image harmonization is a crucial technique in image composition that aims to seamlessly match the background by adjusting the foreground of composite images. Current methods adopt either global-level or pixel-level feature matching. Global-level feat

External link: http://arxiv.org/abs/2312.12729

Report

Boosting Audio-visual Zero-shot Learning with Large Language Models

Authors: Chen, Haoxing; Li, Yaohui; Hong, Yan; Huang, Zizheng; Xu, Zhuoer; Gu, Zhangxuan; Lan, Jun; Zhu, Huijia; Wang, Weiqiang

Audio-visual zero-shot learning aims to recognize unseen classes based on paired audio-visual sequences. Recent methods mainly focus on learning multi-modal features aligned with class names to enhance the generalization ability to unseen categories.

External link: http://arxiv.org/abs/2311.12268

Report

DiffUTE: Universal Text Editing Diffusion Model

Authors: Chen, Haoxing; Xu, Zhuoer; Gu, Zhangxuan; Lan, Jun; Zheng, Xing; Li, Yaohui; Meng, Changhua; Zhu, Huijia; Wang, Weiqiang

Diffusion model based language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle with rendering correct text and text style during generation. To tackle this problem, we propose a u

External link: http://arxiv.org/abs/2305.10825

Report

Mobile User Interface Element Detection Via Adaptively Prompt Tuning

Authors: Gu, Zhangxuan; Xu, Zhuoer; Chen, Haoxing; Lan, Jun; Meng, Changhua; Wang, Weiqiang

Recent object detection approaches rely on pretrained vision-language models for image-text alignment. However, they fail to detect the Mobile User Interface (MUI) element since it contains additional OCR information, which describes its content and

External link: http://arxiv.org/abs/2305.09699

Report

DiffusionInst: Diffusion Model for Instance Segmentation

Authors: Gu, Zhangxuan; Chen, Haoxing; Xu, Zhuoer; Lan, Jun; Meng, Changhua; Wang, Weiqiang

Diffusion frameworks have achieved comparable performance with previous state-of-the-art image generation models. Researchers are curious about its variants in discriminative tasks because of its powerful noise-to-image denoising pipeline. This paper

External link: http://arxiv.org/abs/2212.02773
