Showing 1 - 10 of 1,008 for search: '"TAN Shuai"'
Author:
Shi, Shuwei, Gong, Biao, Chen, Xi, Zheng, Dandan, Tan, Shuai, Yang, Zizheng, Li, Yuyuan, He, Jingwen, Zheng, Kecheng, Chen, Jingdong, Yang, Ming, Zheng, Yinqiang
The image-to-video (I2V) generation is conditioned on the static image, which has been enhanced recently by the motion intensity as an additional control signal. These motion-aware models are appealing to generate diverse motion patterns, yet there l
External link:
http://arxiv.org/abs/2412.05848
Author:
Tan, Shuai, Gong, Biao, Feng, Yutong, Zheng, Kecheng, Zheng, Dandan, Shi, Shuwei, Shen, Yujun, Chen, Jingdong, Yang, Ming
Text serves as the key control signal in video generation due to its narrative nature. To render text descriptions into video clips, current video diffusion models borrow features from text encoders yet struggle with limited text comprehension. The r
External link:
http://arxiv.org/abs/2412.03085
Author:
Tan, Shuai, Gong, Biao, Wang, Xiang, Zhang, Shiwei, Zheng, Dandan, Zheng, Ruobing, Zheng, Kecheng, Chen, Jingdong, Yang, Ming
Character image animation, which generates high-quality videos from a reference image and target pose sequence, has seen significant progress in recent years. However, most existing methods only apply to human figures, which usually do not generalize
External link:
http://arxiv.org/abs/2410.10306
Achieving disentangled control over multiple facial motions and accommodating diverse input modalities greatly enhances the application and entertainment of the talking head generation. This necessitates a deep exploration of the decoupling space for
External link:
http://arxiv.org/abs/2404.01647
Generating emotional talking faces is a practical yet challenging endeavor. To create a lifelike avatar, we draw upon two critical insights from a human perspective: 1) The connection between audio and the non-deterministic facial dynamics, encompass
External link:
http://arxiv.org/abs/2403.06375
Although automatically animating audio-driven talking heads has recently received growing interest, previous efforts have mainly concentrated on achieving lip synchronization with the audio, neglecting two crucial elements for generating expressive v
External link:
http://arxiv.org/abs/2403.06365
Generating stylized talking head with diverse head motions is crucial for achieving natural-looking videos but still remains challenging. Previous works either adopt a regressive method to capture the speaking style, resulting in a coarse style that
External link:
http://arxiv.org/abs/2403.06363
We present a novel mechanism for generating a wide bandwidth squeezed optical output field in an opto-magnomechanical system. In this system, the magnon (mechanical) mode in the yttrium-iron-garnet crystal is coupled to the microwave field (optical f
External link:
http://arxiv.org/abs/2402.04983
Entanglement enhancement of two different magnon modes via nonlinear effect in cavity magnomechanics
We present a scheme to enhance two different magnon modes entanglement in cavity magnomechanics via nonlinear effect. The scheme demonstrated that nonlinear effects enhance entanglement of the two magnon modes. Moreover, the entanglement of the two m
External link:
http://arxiv.org/abs/2402.01150
Author:
Gong, Biao, Tan, Shuai, Feng, Yutong, Xie, Xiaoying, Li, Yuyuan, Chen, Chaochao, Zheng, Kecheng, Shen, Yujun, Zhao, Deli
This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data. Particularly focusing on visual and linguistic modalities, we categorize data knowledge into five unit types, namel
External link:
http://arxiv.org/abs/2302.06891