Výsledky vyhledávání - "Guo, Chuanxiong"

Report

Collie: Finding Performance Anomalies in RDMA Subsystems

Autor: Kong, Xinhao, Zhu, Yibo, Zhou, Huaping, Jiang, Zhuo, Ye, Jianxi, Guo, Chuanxiong, Zhuo, Danyang

High-speed RDMA networks are getting rapidly adopted in the industry for their low latency and reduced CPU overheads. To verify that RDMA can be used in production, system administrators need to understand the set of application workloads that can po

Externí odkaz: http://arxiv.org/abs/2304.11467

Zobrazit plný text záznamu

Report

dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training

Autor: Hu, Hanpeng, Jiang, Chenyu, Zhong, Yuchen, Peng, Yanghua, Wu, Chuan, Zhu, Yibo, Lin, Haibin, Guo, Chuanxiong

Distributed training using multiple devices (e.g., GPUs) has been widely adopted for learning DNN models over large datasets. However, the performance of large-scale distributed training tends to be far from linear speed-up in practice. Given the com

Externí odkaz: http://arxiv.org/abs/2205.02473

Zobrazit plný text záznamu

Report

Aryl: An Elastic Cluster Scheduler for Deep Learning

Autor: Li, Jiamin, Xu, Hong, Zhu, Yibo, Liu, Zherui, Guo, Chuanxiong, Wang, Cong

Companies build separate training and inference GPU clusters for deep learning, and use separate schedulers to manage them. This leads to problems for both training and inference: inference clusters have low GPU utilization when the traffic load is l

Externí odkaz: http://arxiv.org/abs/2202.07896

Zobrazit plný text záznamu

Report

Prediction of GPU Failures Under Deep Learning Workloads

Autor: Liu, Heting, Li, Zhichao, Tan, Cheng, Yang, Rongqiu, Cao, Guohong, Liu, Zherui, Guo, Chuanxiong

Graphics processing units (GPUs) are the de facto standard for processing deep learning (DL) tasks. Meanwhile, GPU failures, which are inevitable, cause severe consequences in DL tasks: they disrupt distributed trainings, crash inference services, an

Externí odkaz: http://arxiv.org/abs/2201.11853

Zobrazit plný text záznamu

Report

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

Autor: Liu, Tianfeng, Chen, Yangrui, Li, Dan, Wu, Chuan, Zhu, Yibo, He, Jun, Peng, Yanghua, Chen, Hongzheng, Chen, Hongzhi, Guo, Chuanxiong

Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data, achieving ground-breaking performance on various tasks such as node classification and graph property prediction. Nonetheless, existing

Externí odkaz: http://arxiv.org/abs/2112.08541

Zobrazit plný text záznamu

Report

Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem

Autor: Tan, Cheng, Li, Zhichao, Zhang, Jian, Cao, Yu, Qi, Sikai, Liu, Zherui, Zhu, Yibo, Guo, Chuanxiong

Multi-Instance GPU (MIG) is a new feature introduced by NVIDIA A100 GPUs that partitions one physical GPU into multiple GPU instances. With MIG, A100 can be the most cost-efficient GPU ever for serving Deep Neural Networks (DNNs). However, discoverin

Externí odkaz: http://arxiv.org/abs/2109.11067

Zobrazit plný text záznamu

Report

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly

Autor: Jin, Yuchen, Zhou, Tianyi, Zhao, Liangyu, Zhu, Yibo, Guo, Chuanxiong, Canini, Marco, Krishnamurthy, Arvind

The learning rate (LR) schedule is one of the most important hyper-parameters needing careful tuning in training DNNs. However, it is also one of the least automated parts of machine learning systems and usually costs significant manual effort and co

Externí odkaz: http://arxiv.org/abs/2105.10762

Zobrazit plný text záznamu

Akademický článek

Tento výsledek nelze pro nepřihlášené uživatele zobrazit.
K zobrazení výsledku je třeba se přihlásit.

Akademický článek

Tento výsledek nelze pro nepřihlášené uživatele zobrazit.
K zobrazení výsledku je třeba se přihlásit.

Akademický článek

Tento výsledek nelze pro nepřihlášené uživatele zobrazit.
K zobrazení výsledku je třeba se přihlásit.

Vyhledávací nástroje:

Upřesnit hledání