Search results

Report

StableAnimator: High-Quality Identity-Preserving Human Image Animation

Authors: Tu, Shuyuan; Xing, Zhen; Han, Xintong; Cheng, Zhi-Qi; Dai, Qi; Luo, Chong; Wu, Zuxuan

Current diffusion models for human image animation struggle to ensure identity (ID) consistency. This paper presents StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-

External link: http://arxiv.org/abs/2411.17697

Report

MRStyle: A Unified Framework for Color Style Transfer with Multi-Modality Reference

Authors: Huang, Jiancheng; Gao, Yu; Jie, Zequn; Zhong, Yujie; Han, Xintong; Ma, Lin

In this paper, we introduce MRStyle, a comprehensive framework that enables color style transfer using multi-modality reference, including image and text. To achieve a unified style feature space for both modalities, we first develop a neural network

External link: http://arxiv.org/abs/2409.05250

Report

MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion

Authors: Tu, Shuyuan; Dai, Qi; Zhang, Zihao; Xie, Sicheng; Cheng, Zhi-Qi; Luo, Chong; Han, Xintong; Wu, Zuxuan; Jiang, Yu-Gang

Despite impressive advancements in diffusion-based video editing models in altering video attributes, there has been limited exploration into modifying motion information while preserving the original protagonist's appearance and background. In this

External link: http://arxiv.org/abs/2405.20325

Report

CoverHunter: Cover Song Identification with Refined Attention and Alignments

Authors: Liu, Feng; Tuo, Deyi; Xu, Yinan; Han, Xintong

Cover song identification (CSI) focuses on finding the same music with different versions in reference anchors given a query track. In this paper, we propose a novel system named CoverHunter that overcomes the shortcomings of existing detec

External link: http://arxiv.org/abs/2306.09025

Report

XFormer: Fast and Accurate Monocular 3D Body Capture

Authors: Qian, Lihui; Han, Xintong; Wang, Faqiang; Liu, Hongyu; Dong, Haoye; Li, Zhiwen; Wei, Huawei; Lin, Zhe; Jin, Cheng-Bin

We present XFormer, a novel human mesh and motion capture method that achieves real-time performance on consumer CPUs given only monocular images as input. The proposed network architecture contains two branches: a keypoint branch that estimates 3D h

External link: http://arxiv.org/abs/2305.11101

Report

PromptFusion: Decoupling Stability and Plasticity for Continual Learning

Authors: Chen, Haoran; Wu, Zuxuan; Han, Xintong; Jia, Menglin; Jiang, Yu-Gang

Current research on continual learning mainly focuses on relieving catastrophic forgetting, and most of their success is at the cost of limiting the performance of newly incoming tasks. Such a trade-off is referred to as the stability-plasticity dile

External link: http://arxiv.org/abs/2303.07223

Report

Human MotionFormer: Transferring Human Motions with Vision Transformers

Authors: Liu, Hongyu; Han, Xintong; Jin, Chengbin; Qian, Lihui; Wei, Huawei; Lin, Zhe; Wang, Faqiang; Dong, Haoye; Song, Yibing; Xu, Jia; Chen, Qifeng

Human motion transfer aims to transfer motions from a target dynamic person to a source static one for motion synthesis. An accurate matching between the source person and the target motion in both large and subtle motion changes is vital for improvi

External link: http://arxiv.org/abs/2302.11306

Report

One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations

Authors: Zhu, Yiming; Liu, Hongyu; Song, Yibing; Yuan, Ziyang; Han, Xintong; Yuan, Chun; Chen, Qifeng; Wang, Jue

Free-form text prompts allow users to describe their intentions during image manipulation conveniently. Based on the visual latent space of StyleGAN[21] and text embedding space of CLIP[34], studies focus on how to map these two latent spaces for tex

External link: http://arxiv.org/abs/2210.07883

Report

Multi-Prompt Alignment for Multi-Source Unsupervised Domain Adaptation

Authors: Chen, Haoran; Han, Xintong; Wu, Zuxuan; Jiang, Yu-Gang

Most existing methods for unsupervised domain adaptation (UDA) rely on a shared network to extract domain-invariant features. However, when facing multiple source domains, optimizing such a network involves updating the parameters of the entire netwo

External link: http://arxiv.org/abs/2209.15210
