Search results

Report

Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction

Authors: Xu, Guowei; Tao, Jiale; Li, Wen; Duan, Lixin

In the realm of stochastic human motion prediction (SHMP), researchers have often turned to generative models like GANs, VAEs, and diffusion models. However, most previous approaches have struggled to accurately predict motions that are both realistic

External link: http://arxiv.org/abs/2407.11494

Report

Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning

Authors: Wei, Fanyue; Zeng, Wei; Li, Zhenyang; Yin, Dawei; Duan, Lixin; Li, Wen

Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images). While remarkable results have been achieved using diffusion-based generation model

External link: http://arxiv.org/abs/2407.06642

Report

Beyond Viewpoint: Robust 3D Object Recognition under Arbitrary Views through Joint Multi-Part Representation

Authors: Fan, Linlong; Huang, Ye; Ge, Yanqi; Li, Wen; Duan, Lixin

Existing view-based methods excel at recognizing 3D objects from predefined viewpoints, but their exploration of recognition under arbitrary views is limited. This is a challenging and realistic setting because each object has different viewpoint pos

External link: http://arxiv.org/abs/2407.03842

Report

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

Authors: Li, Xunsong; Sun, Pengzhan; Liu, Yangcen; Duan, Lixin; Li, Wen

The interactions between humans and objects are important for recognizing object-centric actions. Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector and then fed to an action

External link: http://arxiv.org/abs/2404.11903

Report

Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer

Authors: Ge, Yanqi; Liu, Jiaqi; Fan, Qingnan; Jiang, Xi; Huang, Ye; Qin, Shuai; Gu, Hong; Li, Wen; Duan, Lixin

In this work, we target the task of text-driven style transfer in the context of text-to-image (T2I) diffusion models. The main challenge is consistent structure preservation while enabling effective style transfer effects. The past approaches in thi

External link: http://arxiv.org/abs/2404.06835

Report

SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation

Authors: Ge, Yanqi; Huang, Ye; Li, Wen; Duan, Lixin

We introduced SSR, which utilizes SAM (segment-anything) as a strong regularizer during training, to greatly enhance the robustness of the image encoder for handling various domains. Specifically, given the fact that SAM is pre-trained with a large n

External link: http://arxiv.org/abs/2401.14686

Report

Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

Authors: Ge, Yanqi; Nie, Qiang; Huang, Ye; Liu, Yong; Wang, Chengjie; Zheng, Feng; Li, Wen; Duan, Lixin

One of the ultimate goals of representation learning is to achieve compactness within a class and good separability between classes. Many outstanding metric-based and prototype-based methods following the Expectation-Maximization paradigm have been

External link: http://arxiv.org/abs/2312.11872

Report

Multi-modal Instance Refinement for Cross-domain Action Recognition

Authors: Qing, Yuan; Wu, Naixing; Wan, Shaohua; Duan, Lixin

Unsupervised cross-domain action recognition aims at adapting the model trained on an existing labeled source domain to a new unlabeled target domain. Most existing methods solve the task by directly aligning the feature distributions of source and t

External link: http://arxiv.org/abs/2311.14281

Report

Learning Motion Refinement for Unsupervised Face Animation

Authors: Tao, Jiale; Gu, Shuhang; Li, Wen; Duan, Lixin

Unsupervised face animation aims to generate a human face video based on the appearance of a source image, mimicking the motion from a driving video. Existing methods typically adopt a prior-based motion model (e.g., the local affine motion model o

External link: http://arxiv.org/abs/2310.13912

Report

HFGD: High-level Feature Guided Decoder for Semantic Segmentation

Authors: Huang, Ye; Kang, Di; Gao, Shenghua; Li, Wen; Duan, Lixin

Existing pyramid-based upsamplers (e.g., SemanticFPN), although efficient, usually produce less accurate results than dilation-based models when using the same backbone. This is partially caused by the contaminated high-level features since the

External link: http://arxiv.org/abs/2303.08646
