Showing 1 - 10 of 1,603 results for search: '"Liu Jiaying"'
Author:
Ma, Yiyang, Liu, Xingchao, Chen, Xiaokang, Liu, Wen, Wu, Chengyue, Wu, Zhiyu, Pan, Zizheng, Xie, Zhenda, Zhang, Haowei, Yu, Xingkai, Zhao, Liang, Wang, Yisong, Liu, Jiaying, Ruan, Chong
We present JanusFlow, a powerful framework that unifies image understanding and generation in a single model. JanusFlow introduces a minimalist architecture that integrates autoregressive language models with rectified flow, a state-of-the-art method
External link:
http://arxiv.org/abs/2411.07975
Generative models, as a powerful technique for generation, also gradually become a critical tool for recognition tasks. However, in skeleton-based action recognition, the features obtained from existing pre-trained generative methods contain redundan
External link:
http://arxiv.org/abs/2410.20349
Self-supervised learning has proved effective for skeleton-based human action understanding. However, previous works either rely on contrastive learning that suffers false negative problems or are based on reconstruction that learns too much unessent
External link:
http://arxiv.org/abs/2409.10473
Author:
Gao, Xiang, Liu, Jiaying
Large-scale text-to-image diffusion models have been a revolutionary milestone in the evolution of generative AI and multimodal technology, allowing wonderful image generation with natural-language text prompt. However, the issue of lacking controlla
External link:
http://arxiv.org/abs/2408.00998
Artistic text generation aims to amplify the aesthetic qualities of text while maintaining readability. It can make the text more attractive and better convey its expression, thus enjoying a wide range of application scenarios such as social media di
External link:
http://arxiv.org/abs/2407.14774
In real-world scenarios, human actions often fall into a long-tailed distribution. It makes the existing skeleton-based action recognition works, which are mostly designed based on balanced datasets, suffer from a sharp performance degradation. Recen
External link:
http://arxiv.org/abs/2407.12312
Published in:
Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(3), 1824-1832
Recently, large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I), allowing open-domain image translation via user-provided text prompts. This paper proposes frequency-controlled diffusion
External link:
http://arxiv.org/abs/2407.03006
Coding, which targets compressing and reconstructing data, and intelligence, often regarded at an abstract computational level as being centered around model learning and prediction, interweave recently to give birth to a series of significant progre
External link:
http://arxiv.org/abs/2407.01017
Author:
Liu, Jiaying Lizzy, Wang, Yunlong, Lyu, Yao, Su, Yiheng, Niu, Shuo, Xu, Xuhai Orson, Zhang, Yan
Despite the growing interest in leveraging Large Language Models (LLMs) for content analysis, current studies have primarily focused on text-based content. In the present work, we explored the potential of LLMs in assisting video content analysis by
External link:
http://arxiv.org/abs/2406.19528
Self-supervised learning (SSL), which aims to learn meaningful prior representations from unlabeled data, has been proven effective for skeleton-based action understanding. Different from the image domain, skeleton data possesses sparser spatial stru
External link:
http://arxiv.org/abs/2406.02978