Showing 1 - 10 of 81 for search: '"Yu, Runyi"'
Author:
Wang, Yinhuai, Zhao, Qihan, Yu, Runyi, Zeng, Ailing, Lin, Jing, Luo, Zhengyi, Tsui, Hok Wai, Yu, Jiwen, Li, Xiu, Chen, Qifeng, Zhang, Jian, Zhang, Lei, Tan, Ping
Mastering basketball skills such as diverse layups and dribbling involves complex interactions with the ball and requires real-time adjustments. Traditional reinforcement learning methods for interaction skills rely on labor-intensive, manually designed …
External link:
http://arxiv.org/abs/2408.15270
Author:
Jin, Peng, Li, Hao, Cheng, Zesen, Li, Kehan, Yu, Runyi, Liu, Chang, Ji, Xiangyang, Yuan, Li, Chen, Jie
Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily focus on the …
External link:
http://arxiv.org/abs/2407.10528
Author:
Yu, Runyi, He, Tianyu, Zhang, Ailing, Wang, Yuchi, Guo, Junliang, Tan, Xu, Liu, Chang, Chen, Jie, Bian, Jiang
We aim to edit the lip movements in a talking video according to the given speech while preserving the personal identity and visual details. The task can be decomposed into two sub-problems: (1) speech-driven lip motion generation and (2) visual appearance … (see the sketch below)
External link:
http://arxiv.org/abs/2406.08096
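As a rough illustration of the two-sub-problem decomposition this abstract describes, here is a skeletal Python sketch that wires the stages together. Both function bodies, the shapes, and the names are hypothetical placeholders for the learned models, not the paper's implementation.

```python
import numpy as np

# Hypothetical stubs for the two sub-problems named in the abstract;
# a real system would use learned networks for both stages.

def generate_lip_motion(speech: np.ndarray) -> np.ndarray:
    """(1) Speech-driven lip motion generation: audio features -> per-frame motion."""
    return np.tanh(speech.reshape(-1, 4).mean(axis=1))      # placeholder mapping

def synthesize_appearance(frames: np.ndarray, motion: np.ndarray) -> np.ndarray:
    """(2) Visual appearance synthesis: re-render the source frames under the
    new lip motion while keeping identity and visual details."""
    return frames.copy()                                    # identity placeholder

speech = np.random.randn(64)           # 64 audio features -> 16 motion frames
frames = np.random.rand(16, 8, 8, 3)   # T x H x W x C source video
edited = synthesize_appearance(frames, generate_lip_motion(speech))
```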
Author:
Wang, Yuchi, Guo, Junliang, Bai, Jianhong, Yu, Runyi, He, Tianyu, Tan, Xu, Sun, Xu, Bian, Jiang
Recent talking avatar generation models have made strides in achieving realistic and accurate lip synchronization with the audio, but often fall short in controlling and conveying detailed expressions and emotions of the avatar, making the generated …
External link:
http://arxiv.org/abs/2405.15758
Author:
He, Tianyu, Guo, Junliang, Yu, Runyi, Wang, Yuchi, Zhu, Jialiang, An, Kaikai, Li, Leyi, Tan, Xu, Wang, Chunyu, Hu, Han, Wu, HsiangTao, Zhao, Sheng, Bian, Jiang
Zero-shot talking avatar generation aims at synthesizing natural talking videos from speech and a single portrait image. Previous methods have relied on domain-specific heuristics such as warping-based motion representation and 3D Morphable Models, which …
External link:
http://arxiv.org/abs/2311.15230
Recently, using diffusion models for zero-shot image restoration (IR) has become a popular paradigm. This type of method needs only a pre-trained, off-the-shelf diffusion model, without any finetuning, and can directly handle various IR tasks … (see the sketch below)
External link:
http://arxiv.org/abs/2303.00354
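To make the "pre-trained model, no finetuning" recipe concrete, below is a toy numpy sketch of one common zero-shot pattern: at each reverse-diffusion step, the model's clean estimate is projected to stay consistent with the degraded observation through the operator's pseudo-inverse (a range-null-space style consistency step). The denoiser stub, noise schedule, and degradation operator are illustrative assumptions, not this paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear degradation: y = A x (2x downsampling as row-averaging).
A = np.kron(np.eye(4), np.full((1, 2), 0.5))       # (4, 8) averaging operator
A_pinv = np.linalg.pinv(A)                         # pseudo-inverse A^+

x_true = rng.normal(size=8)
y = A @ x_true                                     # degraded observation

def denoise_to_x0(x_t, t, T):
    # Stand-in for an off-the-shelf diffusion model's x0-prediction;
    # this shrinkage rule is NOT a real model.
    return x_t * (1 - t / T)

T = 50
x = rng.normal(size=8)                             # start from pure noise
for t in range(T, 0, -1):
    x0_hat = denoise_to_x0(x, t, T)
    # Consistency projection: fix the component A can "see" to A^+ y and
    # let the prior fill the null-space.
    x0_hat = A_pinv @ y + (np.eye(8) - A_pinv @ A) @ x0_hat
    noise = rng.normal(size=8) if t > 1 else 0.0
    x = x0_hat + (t / T) * noise                   # crude re-noising step

print("residual ||A x - y||:", np.linalg.norm(A @ x - y))   # ~ 0
```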
Author:
Yu, Runyi, Wang, Zhennan, Wang, Yinhuai, Li, Kehan, Zhao, Yian, Zhang, Jian, Song, Guoli, Chen, Jie
The Position Embedding (PE) is critical for Vision Transformers (VTs) due to the permutation-invariance of the self-attention operation. By analyzing the input and output of each encoder layer in VTs using reparameterization and visualization, we find that … (see the sketch below)
External link:
http://arxiv.org/abs/2212.05262
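The abstract's premise, that self-attention carries no positional information on its own, can be checked directly: without a PE the operation is permutation-equivariant (permuting the tokens just permutes the outputs), and adding a PE breaks that symmetry. Below is a minimal numpy demonstration with identity Q/K/V projections, a simplification rather than the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8                                      # tokens, embedding dim

def self_attention(X):
    # Single-head attention with identity projections, for illustration.
    scores = X @ X.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ X

X = rng.normal(size=(n, d))
perm = rng.permutation(n)

# Without PE: permuting the tokens just permutes the outputs.
print(np.allclose(self_attention(X)[perm], self_attention(X[perm])))   # True

# With a (fixed) PE: the symmetry is broken, so positions now matter.
pe = rng.normal(size=(n, d))
print(np.allclose(self_attention(X + pe)[perm],
                  self_attention(X[perm] + pe)))                       # False
```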
Author:
Li, Kehan, Wang, Zhennan, Cheng, Zesen, Yu, Runyi, Zhao, Yian, Song, Guoli, Liu, Chang, Yuan, Li, Chen, Jie
Recently, self-supervised large-scale visual pre-training models have shown great promise in representing pixel-level semantic relationships, significantly promoting the development of unsupervised dense prediction tasks, e.g., unsupervised semantic segmentation …
External link:
http://arxiv.org/abs/2210.05944
While the Vision Transformer (VT) architecture is becoming trendy in computer vision, pure VT models perform poorly on tiny datasets. To address this issue, this paper proposes locality guidance for improving the performance of VTs on tiny datasets … (see the sketch below)
External link:
http://arxiv.org/abs/2207.10026
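The abstract is cut off before the method details; one natural reading of "locality guidance" is to align the transformer's patch tokens with features from a lightweight convolutional network trained on the same small dataset, so the VT inherits the CNN's locality bias. The PyTorch sketch below follows that reading; the teacher architecture, token layout, and loss form are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Lightweight convolutional teacher (illustrative stand-in)."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1),
        )
    def forward(self, x):                 # (B, 3, H, W) -> (B, dim, h, w)
        return self.net(x)

def locality_guidance_loss(vt_tokens, cnn_feat):
    """Align VT patch tokens with CNN features at matching locations."""
    B, C, h, w = cnn_feat.shape
    teacher = cnn_feat.flatten(2).transpose(1, 2)   # (B, h*w, C)
    student = vt_tokens[:, : h * w, :]              # drop [CLS]-style extras
    return F.mse_loss(F.normalize(student, dim=-1),
                      F.normalize(teacher, dim=-1))

imgs = torch.randn(2, 3, 32, 32)
cnn_feat = TinyCNN()(imgs).detach()                 # frozen teacher features
vt_tokens = torch.randn(2, 16, 64, requires_grad=True)  # fake VT tokens
loss = locality_guidance_loss(vt_tokens, cnn_feat)  # add to the usual CE loss
loss.backward()
```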
Author:
Wang, Zhennan, Li, Kehan, Yu, Runyi, Zhao, Yian, Qiao, Pengchong, Liu, Chang, Xu, Fan, Ji, Xiangyang, Song, Guoli, Chen, Jie
In this paper, we analyze batch normalization from the perspective of discriminability and find the disadvantages ignored by previous studies: the difference in $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguishable … (see the sketch below)
External link:
http://arxiv.org/abs/2207.02625
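A small numpy illustration of the interaction the abstract points at: batch normalization standardizes each channel across the batch, so when two samples share a direction but differ sharply in $l_2$ norm, the norm gap alone can dominate the normalized output and distort their similarity. The toy features below are assumptions for illustration, not the paper's analysis.

```python
import numpy as np

def batch_norm(X, eps=1e-5):
    # Standardize each feature channel across the batch (no affine params).
    mu = X.mean(axis=0, keepdims=True)
    var = X.var(axis=0, keepdims=True)
    return (X - mu) / np.sqrt(var + eps)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
v = rng.normal(size=16)

# Two samples with identical direction but a 10x gap in l2 norm.
X = np.stack([v, 10.0 * v])
print("norms before BN:", np.linalg.norm(X, axis=1))   # differ by 10x
print("cosine before BN:", cosine(X[0], X[1]))         # 1.0: same direction

Y = batch_norm(X)
print("cosine after BN:", cosine(Y[0], Y[1]))          # ~ -1.0: pushed apart
```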