Showing 1 - 10 of 2,205 for search: '"An, Yufa"'
Author:
Shen, Xuan, Song, Zhao, Zhou, Yufa, Chen, Bo, Li, Yanyu, Gong, Yifan, Zhang, Kai, Tan, Hao, Kuen, Jason, Ding, Henghui, Shu, Zhihao, Niu, Wei, Zhao, Pu, Wang, Yanzhi, Gu, Jiuxiang
Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance and efficacy across various applications. The promising results come at the cost of slow inference, as each denoisin
External link:
http://arxiv.org/abs/2412.12444
Author:
Shen, Xuan, Song, Zhao, Zhou, Yufa, Chen, Bo, Liu, Jing, Zhang, Ruiyi, Rossi, Ryan A., Tan, Hao, Yu, Tong, Chen, Xiang, Zhou, Yufan, Sun, Tong, Zhao, Pu, Wang, Yanzhi, Gu, Jiuxiang
Transformers have emerged as the leading architecture in deep learning, proving to be versatile and highly effective across diverse domains beyond language and image processing. However, their impressive performance often incurs high computational co
External link:
http://arxiv.org/abs/2412.12441
Large Language Models (LLMs) have shown immense potential in enhancing various aspects of our daily lives, from conversational AI to search and AI assistants. However, their growing capabilities come at the cost of extremely large model sizes, making
External link:
http://arxiv.org/abs/2410.11261
Large Language Models (LLMs) have demonstrated remarkable capabilities in processing long-context information. However, the quadratic complexity of attention computation with respect to sequence length poses significant computational challenges, and
External link:
http://arxiv.org/abs/2410.09397
Previous work has demonstrated that attention mechanisms are Turing complete. More recently, it has been shown that a looped 13-layer Transformer can function as a universal programmable computer. In contrast, the multi-layer perceptrons with $\maths
External link:
http://arxiv.org/abs/2410.09375
The computational complexity of the self-attention mechanism in popular transformer architectures poses significant challenges for training and inference, and becomes the bottleneck for long inputs. Is it possible to significantly reduce the quadrati
External link:
http://arxiv.org/abs/2408.13233
Cross-attention has become a fundamental module nowadays in many important artificial intelligence applications, e.g., retrieval-augmented generation (RAG), system prompt, guided stable diffusion, and many more. Ensuring cross-attention privacy is cr
External link:
http://arxiv.org/abs/2407.14717
Diffusion models have made rapid progress in generating high-quality samples across various domains. However, a theoretical understanding of the Lipschitz continuity and second momentum properties of the diffusion process is still lacking. In this pa
External link:
http://arxiv.org/abs/2405.16418
Tensor Attention, a multi-view attention that is able to capture high-order correlations among multiple modalities, can overcome the representational limitations of classical matrix attention. However, the $O(n^3)$ time complexity of tensor attention
External link:
http://arxiv.org/abs/2405.16411
Large language models (LLMs), especially those based on the Transformer architecture, have had a profound impact on various aspects of daily life, such as natural language processing, content generation, research methodologies, and more. Nevertheless
External link:
http://arxiv.org/abs/2305.04701