Showing 1 - 3 of 3 for search: '"Kong, Xinhao"'
The complexity of large language model (LLM) serving workloads has substantially increased due to the integration with external tool invocations, such as ChatGPT plugins. In this paper, we identify a new opportunity for efficient LLM serving for requ
External link:
http://arxiv.org/abs/2406.00059
Author:
Kong, Xinhao; Zhu, Yibo; Zhou, Huaping; Jiang, Zhuo; Ye, Jianxi; Guo, Chuanxiong; Zhuo, Danyang
High-speed RDMA networks are getting rapidly adopted in the industry for their low latency and reduced CPU overheads. To verify that RDMA can be used in production, system administrators need to understand the set of application workloads that can po
External link:
http://arxiv.org/abs/2304.11467
Author:
Chen, Jingrong; Wu, Yongji; Lin, Shihan; Xu, Yechen; Kong, Xinhao; Anderson, Thomas; Lentz, Matthew; Yang, Xiaowei; Zhuo, Danyang
Remote Procedure Call (RPC) is a widely used abstraction for cloud computing. The programmer specifies type information for each remote procedure, and a compiler generates stub code linked into each application to marshal and unmarshal arguments into
External link:
http://arxiv.org/abs/2304.07349