A Review on Edge Large Language Models: Design, Execution, and Applications

Authors:	Zheng, Yue; Chen, Yuhao; Qian, Bin; Shi, Xiufang; Shu, Yuanchao; Chen, Jiming
Publication year:	2024
Subject:	Computer Science - Distributed, Parallel, and Cluster Computing
Document type:	Working Paper
Description:	Large language models (LLMs) have revolutionized natural language processing with their exceptional capabilities. However, deploying LLMs on resource-constrained edge devices presents significant challenges due to computational limitations, memory constraints, and edge hardware heterogeneity. This survey summarizes recent developments in edge LLMs across their lifecycle, examining resource-efficient designs from pre-deployment techniques to runtime optimizations. Additionally, it explores on-device LLM applications in personal, enterprise, and industrial scenarios. By synthesizing advancements and identifying future directions, this survey aims to provide a comprehensive understanding of state-of-the-art methods for deploying LLMs on edge devices, bridging the gap between their immense potential and edge computing limitations.
Database:	arXiv
External link:	http://arxiv.org/abs/2410.11845