Search results

Report

MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning

Authors: Liu, Xiaoyang; Mao, Yunyao; Zhou, Wengang; Li, Houqiang

We introduce MotionRL, the first approach to utilize Multi-Reward Reinforcement Learning (RL) for optimizing text-to-motion generation tasks and aligning them with human preferences. Previous works focused on improving numerical performance metrics o

External link: http://arxiv.org/abs/2410.06513

Report

Hyper-Connections

Authors: Zhu, Defa; Huang, Hongzhi; Huang, Zihao; Zeng, Yutao; Mao, Yunyao; Wu, Banggu; Min, Qiyang; Zhou, Xun

We present hyper-connections, a simple yet effective method that can serve as an alternative to residual connections. This approach specifically addresses common drawbacks observed in residual connection variants, such as the seesaw effect between gr

External link: http://arxiv.org/abs/2409.19606

Report

MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition

Authors: Zhao, Weichao; Hu, Hezhen; Zhou, Wengang; Mao, Yunyao; Wang, Min; Li, Houqiang

Sign language recognition (SLR) has long been plagued by insufficient model representation capabilities. Although current pre-training approaches have alleviated this dilemma to some extent and yielded promising performance by employing various prete

External link: http://arxiv.org/abs/2405.20666

Report

Learning Generalizable Human Motion Generator with Reinforcement Learning

Authors: Mao, Yunyao; Liu, Xiaoyang; Zhou, Wengang; Lu, Zhenbo; Li, Houqiang

Text-driven human motion generation, as one of the vital tasks in computer-aided content creation, has recently attracted increasing attention. While pioneering research has largely focused on improving numerical performance metrics on given datasets

External link: http://arxiv.org/abs/2405.15541

Report

I$^2$MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation

Authors: Mao, Yunyao; Deng, Jiajun; Zhou, Wengang; Lu, Zhenbo; Ouyang, Wanli; Li, Houqiang

Recent progress in self-supervised 3D human action representation learning is largely attributed to contrastive learning. However, in conventional contrastive frameworks, the rich complementarity between different skeleton modalities remains under

External link: http://arxiv.org/abs/2310.15568

Report

Masked Motion Predictors are Strong 3D Action Representation Learners

Authors: Mao, Yunyao; Deng, Jiajun; Zhou, Wengang; Fang, Yao; Ouyang, Wanli; Li, Houqiang

In 3D human action recognition, limited supervised data makes it challenging to fully tap into the modeling potential of powerful networks such as transformers. As a result, researchers have been actively investigating effective self-supervised pre-t

External link: http://arxiv.org/abs/2308.07092

Report

Detect Any Shadow: Segment Anything for Video Shadow Detection

Authors: Wang, Yonghui; Zhou, Wengang; Mao, Yunyao; Li, Houqiang

Segment anything model (SAM) has achieved great success in the field of natural image segmentation. Nevertheless, SAM tends to consider shadows as background and therefore does not perform segmentation on them. In this paper, we propose ShadowSAM, a

External link: http://arxiv.org/abs/2305.16698

Report

CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation

Authors: Mao, Yunyao; Zhou, Wengang; Lu, Zhenbo; Deng, Jiajun; Li, Houqiang

In 3D action recognition, there exists rich complementary information between skeleton modalities. Nevertheless, how to model and utilize this information remains a challenging problem for self-supervised 3D action representation learning. In this wo

External link: http://arxiv.org/abs/2208.12448

Report

Joint Inductive and Transductive Learning for Video Object Segmentation

Authors: Mao, Yunyao; Wang, Ning; Zhou, Wengang; Li, Houqiang

Semi-supervised video object segmentation is a task of segmenting the target object in a video sequence given only a mask annotation in the first frame. The limited information available makes it an extremely challenging task. Most previous best-perf

External link: http://arxiv.org/abs/2108.03679
