Showing 1 - 10 of 85 results for the search: '"Cun, Xiaodong"'
Adversarial purification is a defense technique that can defend against various unseen adversarial attacks without modifying the victim classifier. Existing methods often depend on external generative models or cooperation between auxiliary functions… (a generic sketch of the purification idea follows below).
External link:
http://arxiv.org/abs/2406.03143
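The snippet above only names the idea, so here is a minimal, hedged sketch of the noise-then-denoise recipe that many purification defenses share. The `denoiser` and `classifier` arguments are placeholder callables (e.g., a pretrained diffusion denoiser and the frozen victim model); none of this is taken from the linked paper, which may implement purification very differently.

```python
import torch

def purify(x, denoiser, noise_level=0.25, steps=1):
    """Generic purification sketch (not the linked paper's method):
    inject Gaussian noise to wash out adversarial perturbations, then let a
    pretrained denoiser pull the input back toward the natural-image manifold."""
    for _ in range(steps):
        x = x + noise_level * torch.randn_like(x)  # drown the adversarial signal in noise
        x = denoiser(x)                            # project back toward clean images
    return x.clamp(0.0, 1.0)

def defended_predict(x, denoiser, classifier):
    # The victim classifier stays completely unmodified; only its input changes.
    with torch.no_grad():
        return classifier(purify(x, denoiser, steps=2)).argmax(dim=-1)
```

The appeal of this family of defenses is exactly what the abstract states: the classifier is never retrained, so the same purifier can be placed in front of any victim model.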
Video generation has made remarkable progress in recent years, especially since the advent of video diffusion models. Many video generation models can produce plausible synthetic videos, e.g., Stable Video Diffusion (SVD). However, most video models…
External link:
http://arxiv.org/abs/2406.00908
Author:
Zhao, Sijie, Zhang, Yong, Cun, Xiaodong, Yang, Shaoshu, Niu, Muyao, Li, Xiaoyu, Hu, Wenbo, Shan, Ying
Spatio-temporal compression of videos, utilizing networks such as Variational Autoencoders (VAE), plays a crucial role in OpenAI's SORA and numerous other video generative models. For instance, many LLM-like video models learn the distribution of discrete… (a minimal video-VAE sketch follows below).
External link:
http://arxiv.org/abs/2405.20279
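For readers who have not seen spatio-temporal compression before, the following is a minimal sketch of a video VAE that downsamples with 3D convolutions in both time and space. The channel widths, strides, and two-stage layout are illustrative assumptions, not the architecture of the linked paper.

```python
import torch
import torch.nn as nn

class TinyVideoVAE(nn.Module):
    """Toy spatio-temporal VAE: input of shape (B, C, T, H, W) is compressed
    2x in time and 4x in each spatial dimension, then reconstructed."""
    def __init__(self, in_ch=3, latent_ch=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_ch, 64, kernel_size=3, stride=(1, 2, 2), padding=1),   # H/2, W/2
            nn.SiLU(),
            nn.Conv3d(64, 128, kernel_size=3, stride=(2, 2, 2), padding=1),     # T/2, H/4, W/4
            nn.SiLU(),
        )
        self.to_mu = nn.Conv3d(128, latent_ch, kernel_size=1)
        self.to_logvar = nn.Conv3d(128, latent_ch, kernel_size=1)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(latent_ch, 128, kernel_size=4, stride=(2, 2, 2), padding=1),
            nn.SiLU(),
            nn.ConvTranspose3d(128, 64, kernel_size=(3, 4, 4), stride=(1, 2, 2), padding=1),
            nn.SiLU(),
            nn.Conv3d(64, in_ch, kernel_size=3, padding=1),
        )

    def forward(self, video):
        h = self.encoder(video)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.decoder(z), mu, logvar
```

A downstream generator (diffusion- or LLM-style) would then model the distribution of the latent `z` rather than raw pixels, which is why the compression ratio and fidelity of the VAE matter so much for video generation.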
We present MOFA-Video, an advanced controllable image animation method that generates video from a given image using various additional controllable signals (such as a human landmarks reference, manual trajectories, or even another provided video)… (a generic trajectory-to-flow sketch follows below).
External link:
http://arxiv.org/abs/2405.20222
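As a rough illustration of how sparse, user-drawn trajectories could be densified into a conditioning signal for an animation model, here is a generic Gaussian-weighted sparse-to-dense interpolation. The function name and the whole approach are assumptions made for illustration; this is not claimed to be MOFA-Video's actual motion adapter.

```python
import torch

def sparse_to_dense_flow(points, displacements, h, w, sigma=20.0):
    """Hypothetical helper: spread a handful of (x, y) -> (dx, dy) trajectory
    hints into a dense 2-channel flow field via Gaussian-weighted averaging."""
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=torch.float32),
        torch.arange(w, dtype=torch.float32),
        indexing="ij",
    )
    grid = torch.stack([xs, ys], dim=-1)                          # (H, W, 2) pixel coordinates
    d2 = ((grid[None] - points[:, None, None, :]) ** 2).sum(-1)   # (N, H, W) squared distances
    weights = torch.softmax(-d2 / (2 * sigma ** 2), dim=0)        # soft assignment per pixel
    flow = (weights[..., None] * displacements[:, None, None, :]).sum(0)
    return flow.permute(2, 0, 1)                                  # (2, H, W) conditioning map

# Example: two drag points, one pushing right and one pushing up.
pts = torch.tensor([[64.0, 64.0], [192.0, 128.0]])
disp = torch.tensor([[10.0, 0.0], [0.0, -8.0]])
dense_flow = sparse_to_dense_flow(pts, disp, h=256, w=256)
```

A dense field like this is one common way to hand heterogeneous controls (landmarks, trajectories, reference videos) to a motion-conditioned generator in a uniform format.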
Despite the remarkable progress of talking-head-based avatar-creating solutions, directly generating anchor-style videos with full-body motions remains challenging. In this study, we propose Make-Your-Anchor, a novel system necessitating only a one-minute…
External link:
http://arxiv.org/abs/2403.16510
Zero-shot Video Object Segmentation (ZSVOS) aims at segmenting the primary moving object without any human annotations. Mainstream solutions mainly focus on learning a single model on large-scale video datasets, which struggle to generalize to unseen…
External link:
http://arxiv.org/abs/2403.04258
Author:
Guo, Lanqing, He, Yingqing, Chen, Haoxin, Xia, Menghan, Cun, Xiaodong, Wang, Yufei, Huang, Siyu, Zhang, Yong, Wang, Xintao, Chen, Qifeng, Shan, Ying, Wen, Bihan
Diffusion models have proven to be highly effective in image and video generation; however, they still face composition challenges when generating images of varying sizes due to single-scale training data. Adapting large pre-trained diffusion models…
External link:
http://arxiv.org/abs/2402.10491
Author:
Huang, Yuzhou, Xie, Liangbin, Wang, Xintao, Yuan, Ziyang, Cun, Xiaodong, Ge, Yixiao, Zhou, Jiantao, Dong, Chao, Huang, Rui, Zhang, Ruimao, Shan, Ying
Current instruction-based editing methods, such as InstructPix2Pix, often fail to produce satisfactory results in complex scenarios due to their dependence on the simple CLIP text encoder in diffusion models. To rectify this, the paper introduces Sm…
External link:
http://arxiv.org/abs/2312.06739
Large-scale text-to-video (T2V) diffusion models have made great progress in recent years in terms of visual quality, motion, and temporal consistency. However, the generation process is still a black box, where all attributes (e.g., appearance, motion) are…
External link:
http://arxiv.org/abs/2312.03793
Author:
Ma, Yue, Cun, Xiaodong, He, Yingqing, Qi, Chenyang, Wang, Xintao, Shan, Ying, Li, Xiu, Chen, Qifeng
Text-based video editing has recently attracted considerable interest in changing the style or replacing the objects with a similar structure. Beyond this, we demonstrate that properties such as shape, size, location, motion, etc., can also be edited…
External link:
http://arxiv.org/abs/2312.03047