Search results

Report

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Authors: Shi, Shuwei; Gong, Biao; Chen, Xi; Zheng, Dandan; Tan, Shuai; Yang, Zizheng; Li, Yuyuan; He, Jingwen; Zheng, Kecheng; Chen, Jingdong; Yang, Ming; Zheng, Yinqiang

The image-to-video (I2V) generation is conditioned on the static image, which has been enhanced recently by the motion intensity as an additional control signal. These motion-aware models are appealing to generate diverse motion patterns, yet there l

External link: http://arxiv.org/abs/2412.05848

Report

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Authors: Tan, Shuai; Gong, Biao; Feng, Yutong; Zheng, Kecheng; Zheng, Dandan; Shi, Shuwei; Shen, Yujun; Chen, Jingdong; Yang, Ming

Text serves as the key control signal in video generation due to its narrative nature. To render text descriptions into video clips, current video diffusion models borrow features from text encoders yet struggle with limited text comprehension. The r

External link: http://arxiv.org/abs/2412.03085

Report

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Authors: Tan, Shuai; Gong, Biao; Wang, Xiang; Zhang, Shiwei; Zheng, Dandan; Zheng, Ruobing; Zheng, Kecheng; Chen, Jingdong; Yang, Ming

Character image animation, which generates high-quality videos from a reference image and target pose sequence, has seen significant progress in recent years. However, most existing methods only apply to human figures, which usually do not generalize

External link: http://arxiv.org/abs/2410.10306

Report

EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis

Authors: Tan, Shuai; Ji, Bin; Bi, Mengxiao; Pan, Ye

Achieving disentangled control over multiple facial motions and accommodating diverse input modalities greatly enhances the application and entertainment of the talking head generation. This necessitates a deep exploration of the decoupling space for

External link: http://arxiv.org/abs/2404.01647

Report

FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization

Authors: Tan, Shuai; Ji, Bin; Pan, Ye

Generating emotional talking faces is a practical yet challenging endeavor. To create a lifelike avatar, we draw upon two critical insights from a human perspective: 1) The connection between audio and the non-deterministic facial dynamics, encompass

External link: http://arxiv.org/abs/2403.06375

Report

Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style

Authors: Tan, Shuai; Ji, Bin; Pan, Ye

Although automatically animating audio-driven talking heads has recently received growing interest, previous efforts have mainly concentrated on achieving lip synchronization with the audio, neglecting two crucial elements for generating expressive v

External link: http://arxiv.org/abs/2403.06365

Report

Say Anything with Any Style

Authors: Tan, Shuai; Ji, Bin; Ding, Yu; Pan, Ye

Generating stylized talking head with diverse head motions is crucial for achieving natural-looking videos but still remains challenging. Previous works either adopt a regressive method to capture the speaking style, resulting in a coarse style that

External link: http://arxiv.org/abs/2403.06363

Report

Broadband squeezed light field by magnetostriction in an opto-magnomechanical

Authors: Di, Ke; Tan, Shuai; Cheng, Anyu; Zhao, Yinxue; Liu, Yu; Du, Jiajia

We present a novel mechanism for generating a wide bandwidth squeezed optical output field in an opto-magnomechanical system. In this system, the magnon (mechanical) mode in the yttrium-iron-garnet crystal is coupled to the microwave field (optical f

External link: http://arxiv.org/abs/2402.04983

Report

Entanglement enhancement of two different magnon modes via nonlinear effect in cavity magnomechanics

Authors: Di, Ke; Wang, Xi; Tan, Shuai; Zhao, Yinxue; Liu, Yu; Cheng, Anyu; Du, Jiajia

We present a scheme to enhance two different magnon modes entanglement in cavity magnomechanics via nonlinear effect. The scheme demonstrated that nonlinear effects enhance entanglement of the two magnon modes. Moreover, the entanglement of the two m

External link: http://arxiv.org/abs/2402.01150

Report

UKnow: A Unified Knowledge Protocol with Multimodal Knowledge Graph Datasets for Reasoning and Vision-Language Pre-Training

Authors: Gong, Biao; Tan, Shuai; Feng, Yutong; Xie, Xiaoying; Li, Yuyuan; Chen, Chaochao; Zheng, Kecheng; Shen, Yujun; Zhao, Deli

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data. Particularly focusing on visual and linguistic modalities, we categorize data knowledge into five unit types, namel

External link: http://arxiv.org/abs/2302.06891
