Showing 1 - 10
of 103
for search: '"Yu, Minlan"'
Author:
Li, Minghao, Avdiukhin, Dmitrii, Shahout, Rana, Ivkin, Nikita, Braverman, Vladimir, Yu, Minlan
Federated Learning (FL) enables deep learning model training across edge devices and protects user privacy by retaining raw data locally. Data heterogeneity in client distributions slows model convergence and leads to plateauing with reduced precision…
External link:
http://arxiv.org/abs/2411.01580
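The snippet above describes federated learning's core loop: clients train on local data and only model updates, never raw data, reach the server. A minimal FedAvg-style sketch of that loop; the toy 1-D linear model, data, and function names are hypothetical illustrations, not the paper's method:

```python
# Minimal FedAvg-style sketch: each client takes gradient steps on its own
# data, and the server averages the resulting weights. The toy model
# (y ~ w * x) and all names here are hypothetical illustrations.

def local_update(weights, client_data, lr=0.1):
    """One pass of toy gradient steps on a 1-D linear model."""
    w = weights
    for x, y in client_data:
        grad = 2 * (w * x - y) * x   # d/dw of squared error (w*x - y)^2
        w -= lr * grad
    return w

def fedavg(global_w, clients):
    """Average locally updated weights across all clients."""
    updates = [local_update(global_w, data) for data in clients]
    return sum(updates) / len(updates)

clients = [[(1.0, 2.0)], [(2.0, 4.0)]]  # two clients; raw data stays local
w = 0.0
for _ in range(50):
    w = fedavg(w, clients)
print(round(w, 2))  # converges toward w = 2
```

With heterogeneous (non-IID) client data, these local updates pull the global model in different directions, which is the convergence slowdown the abstract refers to.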
Author:
Deng, Yangtao, Shi, Xiang, Jiang, Zhuo, Zhang, Xingjian, Zhang, Lei, Zhang, Zhang, Li, Bo, Song, Zuquan, Zhu, Hang, Liu, Gaohong, Li, Fuliang, Wang, Shuguang, Lin, Haibin, Ye, Jianxi, Yu, Minlan
Large-scale distributed model training requires simultaneous training on up to thousands of machines. Faulty machine detection is critical when an unexpected fault occurs in a machine. From our experience, a training task can encounter two faults per…
External link:
http://arxiv.org/abs/2411.01791
Online LLM inference powers many exciting applications such as intelligent chatbots and autonomous agents. Modern LLM inference engines widely rely on request batching to improve inference throughput, aiming to make it cost-efficient when running on…
External link:
http://arxiv.org/abs/2411.01142
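The snippet above refers to request batching, where an inference engine amortizes the cost of a model step by running many pending requests in one forward pass rather than one pass per request. A toy sketch of the idea, assuming a hypothetical stand-in `forward_pass` and a fixed batch cap:

```python
# Toy illustration of request batching for inference throughput: pending
# requests are grouped and served in one forward pass each, instead of
# one pass per request. forward_pass and MAX_BATCH are hypothetical.
from collections import deque

MAX_BATCH = 4

def forward_pass(batch):
    # Stand-in for one model step over a whole batch of requests.
    return [f"output-for-{req}" for req in batch]

def serve(requests):
    queue = deque(requests)
    results, passes = [], 0
    while queue:
        batch = [queue.popleft() for _ in range(min(MAX_BATCH, len(queue)))]
        results.extend(forward_pass(batch))
        passes += 1
    return results, passes

results, passes = serve([f"req{i}" for i in range(10)])
print(passes)  # 10 requests served in 3 batched passes instead of 10
```

Real LLM engines go further (e.g. continuous batching at token granularity), but the throughput argument is the same: fewer, fuller model steps per request served.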
Author:
Xi, Shaoke, Gao, Jiaqi, Liu, Mengqi, Cao, Jiamin, Li, Fuliang, Bu, Kai, Ren, Kui, Yu, Minlan, Cai, Dennis, Zhai, Ennan
With the growing performance requirements on networked applications, there is a new trend of offloading stateful network applications to SmartNICs to improve performance and reduce the total cost of ownership. However, offloading stateful network applications…
External link:
http://arxiv.org/abs/2410.22229
Author:
Shahout, Rana, Liang, Cong, Xin, Shiji, Lao, Qianru, Cui, Yong, Yu, Minlan, Mitzenmacher, Michael
Augmented Large Language Models (LLMs) enhance the capabilities of standalone LLMs by integrating external data sources through API calls. In interactive LLM applications, efficient scheduling is crucial for maintaining low request completion times…
External link:
http://arxiv.org/abs/2410.18248
Efficient scheduling is crucial for interactive Large Language Model (LLM) applications, where low request completion time directly impacts user engagement. Size-based scheduling algorithms like Shortest Remaining Processing Time (SRPT) aim to reduce average…
External link:
http://arxiv.org/abs/2410.01035
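The snippet above names Shortest Remaining Processing Time (SRPT), the classic size-based policy that always runs the job with the least remaining work, preempting longer jobs, and minimizes mean completion time on a single server. A small unit-time simulation of SRPT under stated assumptions (known job sizes, one server); the function name and example jobs are illustrative:

```python
# Sketch of SRPT scheduling: at every time step, run the job with the
# smallest remaining size among those that have arrived. Jobs are given
# as (arrival_time, size) pairs; time advances in unit steps.
import heapq

def srpt_mean_completion_time(jobs):
    """Return the mean completion time of `jobs` under SRPT."""
    events = sorted(jobs)            # by arrival time
    heap, t, i, done = [], 0, 0, []
    while i < len(events) or heap:
        while i < len(events) and events[i][0] <= t:
            heapq.heappush(heap, [events[i][1], i])  # key: remaining size
            i += 1
        if not heap:
            t = events[i][0]         # idle until the next arrival
            continue
        heap[0][0] -= 1              # run the shortest-remaining job
        t += 1                       # (decreasing the min key keeps the
        if heap[0][0] == 0:          #  heap invariant intact)
            done.append(t)
            heapq.heappop(heap)
    return sum(done) / len(done)

# A long job arrives first; a short one arrives at t=1 and preempts it.
print(srpt_mean_completion_time([(0, 5), (1, 1)]))  # 4.0 (FCFS gives 5.5)
```

The catch for LLM serving, which papers in this area wrestle with, is that a request's "size" (output length) is not known in advance, so size-based policies need predictions or approximations.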
Multimodal large language models (MLLMs) have extended the success of large language models (LLMs) to multiple data types, such as image, text and audio, achieving significant performance in various domains, including multimodal translation, visual question answering…
External link:
http://arxiv.org/abs/2408.03505
The trend of modeless ML inference is growing in popularity as it hides the complexity of model inference from users and caters to diverse user and application accuracy requirements. Previous work mostly focuses on modeless inference in…
External link:
http://arxiv.org/abs/2405.19213
Author:
Lee, Benjamin C., Brooks, David, van Benthem, Arthur, Gupta, Udit, Hills, Gage, Liu, Vincent, Pierce, Benjamin, Stewart, Christopher, Strubell, Emma, Wei, Gu-Yeon, Wierman, Adam, Yao, Yuan, Yu, Minlan
Computing is at a moment of profound opportunity. Emerging applications -- such as capable artificial intelligence, immersive virtual realities, and pervasive sensor systems -- drive unprecedented demand for computing. Despite recent advances toward n…
External link:
http://arxiv.org/abs/2405.13858
Author:
Li, Minghao, Basat, Ran Ben, Vargaftik, Shay, Lao, ChonLam, Xu, Kevin, Mitzenmacher, Michael, Yu, Minlan
Deep neural networks (DNNs) are the de facto standard for essential use cases, such as image classification, computer vision, and natural language processing. As DNNs and datasets get larger, they require distributed training on increasingly larger clusters…
External link:
http://arxiv.org/abs/2302.08545