Showing 1 - 2 of 2 for search: '"Pan, Xiurui"'
Authors:
Pan, Xiurui, Li, Endian, Li, Qiao, Liang, Shengwen, Shan, Yizhou, Zhou, Ke, Luo, Yingwei, Wang, Xiaolin, Zhang, Jie
The widespread adoption of Large Language Models (LLMs) marks a significant milestone in generative AI. Nevertheless, the increasing context length and batch size in offline LLM inference escalate the memory requirement of the key-value (KV) cache, which impo
External link:
http://arxiv.org/abs/2409.04992
The storage stack in traditional operating systems is primarily optimized to improve CPU utilization and hide the long I/O latency imposed by slow I/O devices such as hard disk drives (HDDs). However, the emerging storage media e
External link:
http://arxiv.org/abs/2306.10503