Showing 1 - 10 of 100 results for the search: '"Zhou, Yufa"'
Large Language Models (LLMs) have shown immense potential in enhancing various aspects of our daily lives, from conversational AI to search and AI assistants. However, their growing capabilities come at the cost of extremely large model sizes, making …
External link: http://arxiv.org/abs/2410.11261
Large Language Models (LLMs) have demonstrated remarkable capabilities in processing long-context information. However, the quadratic complexity of attention computation with respect to sequence length poses significant computational challenges, and …
External link: http://arxiv.org/abs/2410.09397
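For orientation on why the cost mentioned above is quadratic: in standard softmax attention the score matrix has one entry per pair of positions, so it has shape (n, n). A minimal NumPy sketch, not taken from the linked paper, that makes this explicit:

```python
# Minimal sketch (illustration only, not the linked paper's method) of standard
# softmax attention; the (n, n) score matrix is what makes the cost quadratic
# in the sequence length n.
import numpy as np

def softmax_attention(Q, K, V):
    """Q, K, V: arrays of shape (n, d). Returns an (n, d) output."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                          # (n, n): quadratic in n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                     # another O(n^2 d) product

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape)  # (1024, 64)
```

At a long-context length of, say, 128k tokens, this (n, n) matrix alone has roughly 1.6 × 10^10 entries per head, which is the bottleneck the abstract refers to.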
Previous work has demonstrated that attention mechanisms are Turing complete. More recently, it has been shown that a looped 13-layer Transformer can function as a universal programmable computer. In contrast, the multi-layer perceptrons with …
External link: http://arxiv.org/abs/2410.09375
The computational complexity of the self-attention mechanism in popular transformer architectures poses significant challenges for training and inference, and becomes the bottleneck for long inputs. Is it possible to significantly reduce the quadratic …
External link: http://arxiv.org/abs/2408.13233
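One generic family of answers to the question posed above replaces the softmax with a non-negative feature map, so that all keys and values can be summarized in a small d × d matrix instead of an n × n one. This is a hedged illustration of that general "linear attention" idea, not the specific algorithm of the linked paper:

```python
# Hedged sketch of kernel-feature ("linear") attention; the feature map phi is
# an illustrative choice, not the linked paper's construction. Cost is
# O(n * d^2) instead of O(n^2 * d) because the (n, n) score matrix is never formed.
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Q, K, V: arrays of shape (n, d); phi must map to non-negative features."""
    Qf, Kf = phi(Q), phi(K)                 # (n, d) feature-mapped queries/keys
    KV = Kf.T @ V                           # (d, d) summary of all keys and values
    Z = Qf @ Kf.sum(axis=0)                 # (n,) per-query normalizer
    return (Qf @ KV) / Z[:, None]           # (n, d) output

n, d = 4096, 64
rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (4096, 64)
```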
Cross-attention has become a fundamental module nowadays in many important artificial intelligence applications, e.g., retrieval-augmented generation (RAG), system prompt, guided stable diffusion, and many more. Ensuring cross-attention privacy is cr…
External link: http://arxiv.org/abs/2407.14717
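For readers unfamiliar with the module named above: cross-attention lets queries from one sequence (e.g., the tokens being generated) attend to keys and values computed from another sequence (e.g., retrieved documents or a system prompt). A minimal sketch of the module itself, not of any privacy mechanism from the linked paper:

```python
# Minimal cross-attention sketch: queries come from X_query, keys/values from a
# separate context X_context (e.g., retrieved passages). Illustration only.
import numpy as np

def cross_attention(X_query, X_context, Wq, Wk, Wv):
    """X_query: (m, d_model), X_context: (n, d_model); returns (m, d_head)."""
    Q, K, V = X_query @ Wq, X_context @ Wk, X_context @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (m, n): each query attends to context
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

m, n, d_model, d_head = 16, 128, 64, 32
rng = np.random.default_rng(3)
X_query = rng.standard_normal((m, d_model))
X_context = rng.standard_normal((n, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
print(cross_attention(X_query, X_context, Wq, Wk, Wv).shape)  # (16, 32)
```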
Diffusion models have made rapid progress in generating high-quality samples across various domains. However, a theoretical understanding of the Lipschitz continuity and second momentum properties of the diffusion process is still lacking. In this paper …
External link: http://arxiv.org/abs/2405.16418
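As background for the two properties named above, a standard DDPM-style forward process and the corresponding second moment can be written down as follows; the exact process and quantities analyzed in the linked paper may differ:

```latex
% Standard DDPM-style forward corruption of data $x_0$ with schedule
% $\bar{\alpha}_t \in (0,1)$ (given only for orientation):
\[
  x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,
  \qquad \epsilon \sim \mathcal{N}(0, I_d),
\]
% so, assuming $x_0$ and $\epsilon$ are independent, the second moment of the
% noisy iterate is
\[
  \mathbb{E}\,\|x_t\|^2 \;=\; \bar{\alpha}_t\,\mathbb{E}\,\|x_0\|^2 \;+\; (1-\bar{\alpha}_t)\, d,
\]
% and a Lipschitz condition on the score function asks for a constant $L$ with
\[
  \|\nabla_x \log p_t(x) - \nabla_x \log p_t(y)\| \;\le\; L\,\|x-y\|.
\]
```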
Tensor Attention, a multi-view attention that is able to capture high-order correlations among multiple modalities, can overcome the representational limitations of classical matrix attention. However, the $O(n^3)$ time complexity of tensor attention …
External link: http://arxiv.org/abs/2405.16411
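To see where the $O(n^3)$ figure comes from: a trilinear attention score has one entry per triple of positions, giving an n × n × n score tensor. A hedged toy sketch of such a trilinear score follows; the particular pairing of keys and values here is an illustrative assumption, not the construction or the acceleration proposed in the linked paper:

```python
# Toy sketch of trilinear "tensor attention" over three views; illustration of
# why the naive cost is cubic in n, not the linked paper's exact construction.
import numpy as np

def tensor_attention(Q, K1, K2, V):
    """Q, K1, K2, V: arrays of shape (n, d). Naive cost is O(n^3 * d)."""
    n, d = Q.shape
    # scores[i, j, k] = <Q[i], K1[j] * K2[k]> (elementwise product of the two keys)
    scores = np.einsum('id,jd,kd->ijk', Q, K1, K2) / d     # (n, n, n) score tensor
    flat = scores.reshape(n, n * n)
    w = np.exp(flat - flat.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # combine the two value views; an elementwise outer pairing is used here
    V2 = np.einsum('jd,kd->jkd', V, V).reshape(n * n, d)
    return w @ V2                                          # (n, d) output

n, d = 32, 8   # keep n tiny: memory already scales as n^3
rng = np.random.default_rng(2)
Q, K1, K2, V = (rng.standard_normal((n, d)) for _ in range(4))
print(tensor_attention(Q, K1, K2, V).shape)  # (32, 8)
```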
Large language models (LLMs), especially those based on the Transformer architecture, have had a profound impact on various aspects of daily life, such as natural language processing, content generation, research methodologies, and more. Nevertheless, …
External link: http://arxiv.org/abs/2305.04701
Author: Liu, Yong*, Tan, Jie, Ngwayi, James Reeves Mbori, Zhuang, Xiaolin, Ding, Zhaohan, Chen, Yujie, Zhou, Yufa, Porter, Daniel Edward
Published in: Journal of Surgical Education, January 2024, 81(1):76-83
Academic article