Showing 1 - 10 of 85 results for the search: '"Cun, Xiaodong"'
Adversarial purification is a defense technique that can defend against various unseen adversarial attacks without modifying the victim classifier. Existing methods often depend on external generative models or cooperation between auxiliary functions… (a generic sketch of the purification idea follows below).
External link:
http://arxiv.org/abs/2406.03143
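The snippet above only names the idea, so here is a minimal, hedged sketch of the noise-then-denoise recipe that many purification defenses share. The `denoiser` and `classifier` arguments are placeholder callables (e.g., a pretrained diffusion denoiser and the frozen victim model); none of this is taken from the linked paper, which may implement purification very differently.

```python
import torch

def purify(x, denoiser, noise_level=0.25, steps=1):
    """Generic purification sketch (not the linked paper's method):
    inject Gaussian noise to wash out adversarial perturbations, then let a
    pretrained denoiser pull the input back toward the natural-image manifold."""
    for _ in range(steps):
        x = x + noise_level * torch.randn_like(x)  # drown the adversarial signal in noise
        x = denoiser(x)                            # project back toward clean images
    return x.clamp(0.0, 1.0)

def defended_predict(x, denoiser, classifier):
    # The victim classifier stays completely unmodified; only its input changes.
    with torch.no_grad():
        return classifier(purify(x, denoiser, steps=2)).argmax(dim=-1)
```

The appeal of this family of defenses is exactly what the abstract states: the classifier is never retrained, so the same purifier can be placed in front of any victim model.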
Video generation has made remarkable progress in recent years, especially since the advent of video diffusion models. Many video generation models can produce plausible synthetic videos, e.g., Stable Video Diffusion (SVD). However, most video models…
External link:
http://arxiv.org/abs/2406.00908
Author:
Zhao, Sijie, Zhang, Yong, Cun, Xiaodong, Yang, Shaoshu, Niu, Muyao, Li, Xiaoyu, Hu, Wenbo, Shan, Ying
Spatio-temporal compression of videos, utilizing networks such as Variational Autoencoders (VAE), plays a crucial role in OpenAI's SORA and numerous other video generative models. For instance, many LLM-like video models learn the distribution of discrete… (a minimal video-VAE sketch follows below).
External link:
http://arxiv.org/abs/2405.20279
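For readers who have not seen spatio-temporal compression before, the following is a minimal sketch of a video VAE that downsamples with 3D convolutions in both time and space. The channel widths, strides, and two-stage layout are illustrative assumptions, not the architecture of the linked paper.

```python
import torch
import torch.nn as nn

class TinyVideoVAE(nn.Module):
    """Toy spatio-temporal VAE: input of shape (B, C, T, H, W) is compressed
    2x in time and 4x in each spatial dimension, then reconstructed."""
    def __init__(self, in_ch=3, latent_ch=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_ch, 64, kernel_size=3, stride=(1, 2, 2), padding=1),   # H/2, W/2
            nn.SiLU(),
            nn.Conv3d(64, 128, kernel_size=3, stride=(2, 2, 2), padding=1),     # T/2, H/4, W/4
            nn.SiLU(),
        )
        self.to_mu = nn.Conv3d(128, latent_ch, kernel_size=1)
        self.to_logvar = nn.Conv3d(128, latent_ch, kernel_size=1)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(latent_ch, 128, kernel_size=4, stride=(2, 2, 2), padding=1),
            nn.SiLU(),
            nn.ConvTranspose3d(128, 64, kernel_size=(3, 4, 4), stride=(1, 2, 2), padding=1),
            nn.SiLU(),
            nn.Conv3d(64, in_ch, kernel_size=3, padding=1),
        )

    def forward(self, video):
        h = self.encoder(video)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.decoder(z), mu, logvar
```

A downstream generator (diffusion- or LLM-style) would then model the distribution of the latent `z` rather than raw pixels, which is why the compression ratio and fidelity of the VAE matter so much for video generation.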
We present MOFA-Video, an advanced controllable image animation method that generates video from a given image using various additional controllable signals (such as a human landmarks reference, manual trajectories, or even another provided video)… (a generic trajectory-to-flow sketch follows below).
External link:
http://arxiv.org/abs/2405.20222
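As a rough illustration of how sparse, user-drawn trajectories could be densified into a conditioning signal for an animation model, here is a generic Gaussian-weighted sparse-to-dense interpolation. The function name and the whole approach are assumptions made for illustration; this is not claimed to be MOFA-Video's actual motion adapter.

```python
import torch

def sparse_to_dense_flow(points, displacements, h, w, sigma=20.0):
    """Hypothetical helper: spread a handful of (x, y) -> (dx, dy) trajectory
    hints into a dense 2-channel flow field via Gaussian-weighted averaging."""
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=torch.float32),
        torch.arange(w, dtype=torch.float32),
        indexing="ij",
    )
    grid = torch.stack([xs, ys], dim=-1)                          # (H, W, 2) pixel coordinates
    d2 = ((grid[None] - points[:, None, None, :]) ** 2).sum(-1)   # (N, H, W) squared distances
    weights = torch.softmax(-d2 / (2 * sigma ** 2), dim=0)        # soft assignment per pixel
    flow = (weights[..., None] * displacements[:, None, None, :]).sum(0)
    return flow.permute(2, 0, 1)                                  # (2, H, W) conditioning map

# Example: two drag points, one pushing right and one pushing up.
pts = torch.tensor([[64.0, 64.0], [192.0, 128.0]])
disp = torch.tensor([[10.0, 0.0], [0.0, -8.0]])
dense_flow = sparse_to_dense_flow(pts, disp, h=256, w=256)
```

A dense field like this is one common way to hand heterogeneous controls (landmarks, trajectories, reference videos) to a motion-conditioned generator in a uniform format.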
Despite the remarkable progress of talking-head-based avatar-creating solutions, directly generating anchor-style videos with full-body motions remains challenging. In this study, we propose Make-Your-Anchor, a novel system necessitating only a one-minute…
External link:
http://arxiv.org/abs/2403.16510
Zero-shot Video Object Segmentation (ZSVOS) aims at segmenting the primary moving object without any human annotations. Mainstream solutions mainly focus on learning a single model on large-scale video datasets, which struggle to generalize to unseen…
External link:
http://arxiv.org/abs/2403.04258
Author:
Guo, Lanqing, He, Yingqing, Chen, Haoxin, Xia, Menghan, Cun, Xiaodong, Wang, Yufei, Huang, Siyu, Zhang, Yong, Wang, Xintao, Chen, Qifeng, Shan, Ying, Wen, Bihan
Diffusion models have proven to be highly effective in image and video generation; however, they still face composition challenges when generating images of varying sizes due to single-scale training data. Adapting large pre-trained diffusion models…
External link:
http://arxiv.org/abs/2402.10491
Author:
Huang, Yuzhou, Xie, Liangbin, Wang, Xintao, Yuan, Ziyang, Cun, Xiaodong, Ge, Yixiao, Zhou, Jiantao, Dong, Chao, Huang, Rui, Zhang, Ruimao, Shan, Ying
Current instruction-based editing methods, such as InstructPix2Pix, often fail to produce satisfactory results in complex scenarios due to their dependence on the simple CLIP text encoder in diffusion models. To rectify this, the paper introduces Sm…
External link:
http://arxiv.org/abs/2312.06739
Large-scale text-to-video (T2V) diffusion models have made great progress in recent years in terms of visual quality, motion, and temporal consistency. However, the generation process is still a black box, where all attributes (e.g., appearance, motion) are…
External link:
http://arxiv.org/abs/2312.03793
Author:
Ma, Yue, Cun, Xiaodong, He, Yingqing, Qi, Chenyang, Wang, Xintao, Shan, Ying, Li, Xiu, Chen, Qifeng
Text-based video editing has recently attracted considerable interest in changing the style or replacing the objects with a similar structure. Beyond this, we demonstrate that properties such as shape, size, location, motion, etc., can also be edited…
External link:
http://arxiv.org/abs/2312.03047