Search results - "Wang, Chengjie"

Report

RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

Authors: Ji, Xiaozhong; Lin, Chuming; Ding, Zhonggan; Tai, Ying; Yang, Jian; Zhu, Junwei; Hu, Xiaobin; Zhang, Jiangning; Luo, Donghao; Wang, Chengjie

Person-generic audio-driven face generation is a challenging task in computer vision. Previous methods have achieved remarkable progress in audio-visual synchronization, but there is still a significant gap between current results and practical appli

External link: http://arxiv.org/abs/2406.18284

Report

DF40: Toward Next-Generation Deepfake Detection

Authors: Yan, Zhiyuan; Yao, Taiping; Chen, Shen; Zhao, Yandan; Fu, Xinghe; Zhu, Junwei; Luo, Donghao; Yuan, Li; Wang, Chengjie; Ding, Shouhong; Wu, Yunsheng

We propose a new comprehensive benchmark to revolutionize the current deepfake detection field to the next generation. Predominantly, existing works identify top-notch detection algorithms and models by adhering to the common practice: training detec

External link: http://arxiv.org/abs/2406.13495

Report

AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

Authors: Kong, Lingjie; Wu, Kai; Hu, Xiaobin; Han, Wenhui; Peng, Jinlong; Xu, Chengming; Luo, Donghao; Zhang, Jiangning; Wang, Chengjie; Fu, Yanwei

Text-to-image based object customization, aiming to generate images with the same identity (ID) as objects of interest in accordance with text prompts and reference images, has made significant progress. However, recent customizing research is domina

External link: http://arxiv.org/abs/2406.11643

Report

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

Authors: Zhang, Jiangning; He, Haoyang; Gan, Zhenye; He, Qingdong; Cai, Yuxuan; Xue, Zhucun; Wang, Yabiao; Wang, Chengjie; Xie, Lei; Liu, Yong

Visual anomaly detection aims to identify anomalous regions in images through unsupervised learning paradigms, with increasing application demand and value in fields such as industrial inspection and medical lesion detection. Despite significant prog

External link: http://arxiv.org/abs/2406.03262

Report

Decision Boundary-aware Knowledge Consolidation Generates Better Instance-Incremental Learner

Authors: Nie, Qiang; Fu, Weifu; Lin, Yuhuan; Li, Jialin; Zhou, Yifeng; Liu, Yong; Zhu, Lei; Wang, Chengjie

Instance-incremental learning (IIL) focuses on learning continually with data of the same classes. Compared to class-incremental learning (CIL), the IIL is seldom explored because IIL suffers less from catastrophic forgetting (CF). However, besides r

External link: http://arxiv.org/abs/2406.03065

Report

M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising

Authors: Wang, Chengjie; Zhu, Haokun; Peng, Jinlong; Wang, Yue; Yi, Ran; Wu, Yunsheng; Ma, Lizhuang; Zhang, Jiangning

Existing industrial anomaly detection methods primarily concentrate on unsupervised learning with pristine RGB images. Yet, both RGB and 3D data are crucial for anomaly detection, and the datasets are seldom completely clean in practical scenarios. T

External link: http://arxiv.org/abs/2406.02263

Report

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

Authors: Wu, Kai; Jiang, Boyuan; Jiang, Zhengkai; He, Qingdong; Luo, Donghao; Wang, Shengzhi; Liu, Qingwen; Wang, Chengjie

Multimodal large language models (MLLMs) contribute a powerful mechanism to understanding visual information building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating lengthy, detail

External link: http://arxiv.org/abs/2405.20081

Report

VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

Authors: Wang, Qilin; Jiang, Zhengkai; Xu, Chengming; Zhang, Jiangning; Wang, Yabiao; Zhang, Xinyi; Cao, Yun; Cao, Weijian; Wang, Chengjie; Fu, Yanwei

Human image animation involves generating a video from a static image by following a specified pose sequence. Current approaches typically adopt a multi-stage pipeline that separately learns appearance and motion, which often leads to appearance degr

External link: http://arxiv.org/abs/2405.18156

Report

AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

Authors: Zhang, Sihe; He, Qingdong; Peng, Jinlong; Li, Yuxi; Jiang, Zhengkai; Wu, Jiafu; Chi, Mingmin; Wang, Yabiao; Wang, Chengjie

Image retrieval aims to identify visually similar images within a database using a given query image. Traditional methods typically employ both global and local features extracted from images for matching, and may also apply re-ranking techniques to

External link: http://arxiv.org/abs/2405.17718

Report

PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning

Authors: He, Qingdong; Zhang, Jiangning; Peng, Jinlong; He, Haoyang; Wang, Yabiao; Wang, Chengjie

Transformers have revolutionized the point cloud learning task, but the quadratic complexity hinders its extension to long sequence and makes a burden on limited computational resources. The recent advent of RWKV, a fresh breed of deep sequence model

External link: http://arxiv.org/abs/2405.15214
