Showing 1 - 10 of 109 for search: '"Hu, Qinghao"'
Graph Neural Networks (GNNs) have shown great superiority on non-Euclidean graph data, achieving ground-breaking performance on various graph-related tasks. As a practical solution for training GNNs on large graphs with billions of nodes and edges, …
External link:
http://arxiv.org/abs/2409.14939
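The practical solution this abstract alludes to is sampling-based mini-batch training. Below is a minimal sketch of that general idea using PyTorch Geometric's NeighborLoader; the dataset, model, and hyperparameters are illustrative, not taken from the paper:

```python
# Sampling-based mini-batch GNN training (illustrative sketch only;
# not the paper's system). Each batch expands a small set of seed
# nodes by sampling a fixed number of neighbors per layer, so memory
# stays bounded regardless of the full graph's size.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import SAGEConv

dataset = Planetoid(root="data", name="Cora")  # stand-in for a large graph
data = dataset[0]

loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],       # 10 sampled neighbors per hop, 2 hops
    batch_size=128,
    input_nodes=data.train_mask,  # seed nodes for each mini-batch
)

class SAGE(torch.nn.Module):
    def __init__(self, in_dim, hidden, out_dim):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, out_dim)

    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

model = SAGE(dataset.num_features, 64, dataset.num_classes)
opt = torch.optim.Adam(model.parameters(), lr=0.01)

for batch in loader:
    opt.zero_grad()
    out = model(batch.x, batch.edge_index)
    # The loss is computed only on the seed nodes at the front of the batch.
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    opt.step()
```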
Author:
Xue, Fuzhao, Chen, Yukang, Li, Dacheng, Hu, Qinghao, Zhu, Ligeng, Li, Xiuyu, Fang, Yunhao, Tang, Haotian, Yang, Shang, Liu, Zhijian, He, Ethan, Yin, Hongxu, Molchanov, Pavlo, Kautz, Jan, Fan, Linxi, Zhu, Yuke, Lu, Yao, Han, Song
Long-context capability is critical for multi-modal foundation models, especially for long video understanding. We introduce LongVILA, a full-stack solution for long-context visual-language models by co-designing the algorithm and system. …
External link:
http://arxiv.org/abs/2408.10188
Author:
Duan, Jiangfei, Zhang, Shuo, Wang, Zerui, Jiang, Lijuan, Qu, Wenwen, Hu, Qinghao, Wang, Guoteng, Weng, Qizhen, Yan, Hang, Zhang, Xingcheng, Qiu, Xipeng, Lin, Dahua, Wen, Yonggang, Jin, Xin, Zhang, Tianwei, Sun, Peng
Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with their sophisticated capabilities. Training these models requires vast GPU clusters and significant computing time, posing major challenges in terms of scalability …
External link:
http://arxiv.org/abs/2407.20018
The Graph Transformer is a new architecture that surpasses GNNs in graph learning. While inspiring algorithmic advances have emerged, their practical adoption remains limited, particularly on real-world graphs with up to millions of nodes. …
External link:
http://arxiv.org/abs/2407.14106
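A back-of-the-envelope check (not from the paper) of why vanilla global self-attention is impractical at the million-node scale the abstract mentions:

```python
# Dense self-attention materializes an N x N score matrix.
n_nodes = 1_000_000
bytes_per_score = 2  # fp16

attn_bytes = n_nodes ** 2 * bytes_per_score
print(f"{attn_bytes / 1e12:.0f} TB per attention matrix")  # 2 TB
# And that is a single head in a single layer; multiply by heads
# and layers, and the quadratic cost dominates everything else.
```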
Author:
Gu, Diandian, Sun, Peng, Hu, Qinghao, Huang, Ting, Chen, Xun, Xiong, Yingtong, Wang, Guoteng, Chen, Qiaoling, Zhao, Shangchun, Fang, Jiarui, Wen, Yonggang, Zhang, Tianwei, Jin, Xin, Liu, Xuanzhe
Efficiently training LLMs with long sequences is important yet challenging due to massive computation and memory requirements. Sequence parallelism has been proposed to tackle these problems, but existing methods suffer from scalability or efficiency …
External link:
http://arxiv.org/abs/2406.18485
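For readers unfamiliar with the technique, here is a single-process sketch of the basic idea behind sequence parallelism, not this paper's specific method: the sequence dimension is sharded across workers, and each shard's attention consumes keys and values gathered from all shards:

```python
# Toy sequence parallelism (generic idea, not the paper's algorithm).
# In a real system each chunk lives on a different device and the
# full K/V would come from an all-gather or ring exchange.
import torch

seq_len, n_dev, d = 8, 2, 16
q = torch.randn(seq_len, d)
k = torch.randn(seq_len, d)
v = torch.randn(seq_len, d)

outputs = []
for q_local in q.chunk(n_dev, dim=0):  # one query shard per "device"
    scores = q_local @ k.t() / d ** 0.5
    outputs.append(torch.softmax(scores, dim=-1) @ v)

out = torch.cat(outputs, dim=0)

# The sharded result matches unsharded attention up to float noise.
ref = torch.softmax(q @ k.t() / d ** 0.5, dim=-1) @ v
assert torch.allclose(out, ref, atol=1e-5)
```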
Author:
Hu, Qinghao, Ye, Zhisheng, Wang, Zerui, Wang, Guoteng, Zhang, Meng, Chen, Qiaoling, Sun, Peng, Lin, Dahua, Wang, Xiaolin, Luo, Yingwei, Wen, Yonggang, Zhang, Tianwei
Large Language Models (LLMs) have shown impressive performance across several transformative tasks. However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs, a process often riddled with challenges such as …
External link:
http://arxiv.org/abs/2403.07648
Author:
Chen, Qiaoling, Gu, Diandian, Wang, Guoteng, Chen, Xun, Xiong, YingTong, Huang, Ting, Hu, Qinghao, Jin, Xin, Wen, Yonggang, Zhang, Tianwei, Sun, Peng
Large language models (LLMs) with long sequences are beginning to power fundamentally new applications that we use every day. Existing methods for long-sequence LLM training are neither efficient nor compatible with commonly used training algorithms …
External link:
http://arxiv.org/abs/2401.09149
Author:
Zhu, Zeyu, Li, Fanrong, Li, Gang, Liu, Zejian, Mo, Zitao, Hu, Qinghao, Liang, Xiaoyao, Cheng, Jian
Graph Neural Networks (GNNs) are becoming a promising technique in various domains due to their excellent capabilities in modeling non-Euclidean data. Although a spectrum of accelerators has been proposed to accelerate the inference of GNNs, …
External link:
http://arxiv.org/abs/2311.09775
Author:
Chen, Qiaoling, Hu, Qinghao, Wang, Guoteng, Xiong, Yingtong, Huang, Ting, Chen, Xun, Gao, Yang, Yan, Hang, Wen, Yonggang, Zhang, Tianwei, Sun, Peng
Training large language models (LLMs) encounters challenges in GPU memory consumption due to the high memory requirements of model states. The widely used Zero Redundancy Optimizer (ZeRO) addresses this issue through strategic sharding, but introduces …
External link:
http://arxiv.org/abs/2311.00257
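An illustrative sketch of the general ZeRO idea, not this paper's variant: each data-parallel rank keeps optimizer state only for its 1/N slice of the parameters, trading memory for the communication needed to regather updated shards:

```python
# ZeRO-style optimizer-state sharding, reduced to the core arithmetic.
import torch

world_size = 4
params = torch.randn(1024)  # flattened model parameters

shards = params.chunk(world_size)  # one slice per data-parallel rank

# Adam keeps two fp32 moments per parameter. Without sharding, every
# rank stores both moments for all parameters; with ZeRO-1, only for
# its own shard.
full_state_bytes = 2 * params.numel() * 4
sharded_state_bytes = 2 * shards[0].numel() * 4
print(full_state_bytes, "->", sharded_state_bytes, "bytes per rank")
```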
Author:
Yao, Xingting, Hu, Qinghao, Liu, Tielong, Mo, Zitao, Zhu, Zeyu, Zhuge, Zhengyang, Cheng, Jian
Spiking neural networks (SNNs) have been thriving on numerous tasks, leveraging their promising energy efficiency and their potential as biologically plausible intelligence. Meanwhile, Neural Radiance Fields (NeRF) render high-quality …
External link:
http://arxiv.org/abs/2309.10987