Showing 1 - 10 of 62 for search: '"Luo Donghao"'
Author:
Ji, Xiaozhong, Hu, Xiaobin, Xu, Zhihong, Zhu, Junwei, Lin, Chuming, He, Qingdong, Zhang, Jiangning, Luo, Donghao, Chen, Yi, Lin, Qin, Lu, Qinglin, Wang, Chengjie
The study of talking face generation mainly explores the intricacies of synchronizing facial movements and crafting visually appealing, temporally-coherent animations. However, due to the limited exploration of global audio perception, current…
External link:
http://arxiv.org/abs/2411.16331
Author:
Xu, Pengcheng, Jiang, Boyuan, Hu, Xiaobin, Luo, Donghao, He, Qingdong, Zhang, Jiangning, Wang, Chengjie, Wu, Yunsheng, Ling, Charles, Wang, Boyu
Leveraging the large generative prior of the flow transformer for tuning-free image editing requires authentic inversion to project the image into the model's domain and a flexible invariance control mechanism to preserve non-target contents. However…
External link:
http://arxiv.org/abs/2411.15843
Author:
Jiang, Boyuan, Hu, Xiaobin, Luo, Donghao, He, Qingdong, Xu, Chengming, Peng, Jinlong, Zhang, Jiangning, Wang, Chengjie, Wu, Yunsheng, Fu, Yanwei
Although image-based virtual try-on has made considerable progress, emerging approaches still encounter challenges in producing high-fidelity and robust fitting images across diverse scenarios. These methods often struggle with issues such as texture…
External link:
http://arxiv.org/abs/2411.10499
Author:
Liang, Yujie, Hu, Xiaobin, Jiang, Boyuan, Luo, Donghao, Wu, Kai, Han, Wenhui, Jin, Taisong, Wang, Chengjie
Although diffusion-based image virtual try-on has made considerable progress, emerging approaches still struggle to effectively address the issue of hand occlusion (i.e., clothing regions occluded by the hand part), leading to a notable degradation…
External link:
http://arxiv.org/abs/2408.12340
Author:
Li, Bang, Luo, Donghao, Liang, Yujie, Yang, Jing, Ding, Zengmao, Peng, Xu, Jiang, Boyuan, Han, Shengwei, Sui, Dan, Qin, Peichao, Wu, Pian, Wang, Chaoyang, Qi, Yun, Jin, Taisong, Wang, Chengjie, Huang, Xiaoming, Shu, Zhan, Ji, Rongrong, Liu, Yongge, Wu, Yunsheng
Oracle bone inscriptions (OBI) constitute the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can…
External link:
http://arxiv.org/abs/2407.03900
Author:
Ji, Xiaozhong, Lin, Chuming, Ding, Zhonggan, Tai, Ying, Zhu, Junwei, Hu, Xiaobin, Luo, Donghao, Ge, Yanhao, Wang, Chengjie
Person-generic audio-driven face generation is a challenging task in computer vision. Previous methods have achieved remarkable progress in audio-visual synchronization, but there is still a significant gap between current results and practical…
External link:
http://arxiv.org/abs/2406.18284
Author:
Yan, Zhiyuan, Yao, Taiping, Chen, Shen, Zhao, Yandan, Fu, Xinghe, Zhu, Junwei, Luo, Donghao, Wang, Chengjie, Ding, Shouhong, Wu, Yunsheng, Yuan, Li
We propose a new comprehensive benchmark to revolutionize the current deepfake detection field to the next generation. Predominantly, existing works identify top-notch detection algorithms and models by adhering to the common practice: training…
External link:
http://arxiv.org/abs/2406.13495
Author:
Kong, Lingjie, Wu, Kai, Hu, Xiaobin, Han, Wenhui, Peng, Jinlong, Xu, Chengming, Luo, Donghao, Li, Mengtian, Zhang, Jiangning, Wang, Chengjie, Fu, Yanwei
Recent advances in diffusion-based text-to-image models have simplified creating high-fidelity images, but preserving the identity (ID) of specific elements, like a personal dog, is still challenging. Object customization, using reference images and…
External link:
http://arxiv.org/abs/2406.11643
Author:
Wu, Kai, Jiang, Boyuan, Jiang, Zhengkai, He, Qingdong, Luo, Donghao, Wang, Shengzhi, Liu, Qingwen, Wang, Chengjie
Multimodal large language models (MLLMs) provide a powerful mechanism for understanding visual information by building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating lengthy…
External link:
http://arxiv.org/abs/2405.20081
Author:
Xu, Chengming, Hu, Kai, Wang, Qilin, Luo, Donghao, Zhang, Jiangning, Hu, Xiaobin, Fu, Yanwei, Wang, Chengjie
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images. In this paper, we present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion (SD) to address challenges such as…
External link:
http://arxiv.org/abs/2405.15287