Showing 1 - 10 of 109 for search: '"Hu, Qinghao"'
Graph Neural Networks (GNNs) have shown great superiority on non-Euclidean graph data, achieving ground-breaking performance on various graph-related tasks. As a practical solution for training GNNs on large graphs with billions of nodes and edges, …
External link:
http://arxiv.org/abs/2409.14939
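The practical solution this abstract alludes to is sampling-based mini-batch training. Below is a minimal sketch of that general idea using PyTorch Geometric's NeighborLoader; the dataset, model, and hyperparameters are illustrative, not taken from the paper:

```python
# Sampling-based mini-batch GNN training (illustrative sketch only;
# not the paper's system). Each batch expands a small set of seed
# nodes by sampling a fixed number of neighbors per layer, so memory
# stays bounded regardless of the full graph's size.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import SAGEConv

dataset = Planetoid(root="data", name="Cora")  # stand-in for a large graph
data = dataset[0]

loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],       # 10 sampled neighbors per hop, 2 hops
    batch_size=128,
    input_nodes=data.train_mask,  # seed nodes for each mini-batch
)

class SAGE(torch.nn.Module):
    def __init__(self, in_dim, hidden, out_dim):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, out_dim)

    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

model = SAGE(dataset.num_features, 64, dataset.num_classes)
opt = torch.optim.Adam(model.parameters(), lr=0.01)

for batch in loader:
    opt.zero_grad()
    out = model(batch.x, batch.edge_index)
    # The loss is computed only on the seed nodes at the front of the batch.
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    opt.step()
```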
Author:
Xue, Fuzhao, Chen, Yukang, Li, Dacheng, Hu, Qinghao, Zhu, Ligeng, Li, Xiuyu, Fang, Yunhao, Tang, Haotian, Yang, Shang, Liu, Zhijian, He, Ethan, Yin, Hongxu, Molchanov, Pavlo, Kautz, Jan, Fan, Linxi, Zhu, Yuke, Lu, Yao, Han, Song
Long-context capability is critical for multi-modal foundation models, especially for long video understanding. We introduce LongVILA, a full-stack solution for long-context visual-language models by co-designing the algorithm and system. …
External link:
http://arxiv.org/abs/2408.10188
Author:
Duan, Jiangfei, Zhang, Shuo, Wang, Zerui, Jiang, Lijuan, Qu, Wenwen, Hu, Qinghao, Wang, Guoteng, Weng, Qizhen, Yan, Hang, Zhang, Xingcheng, Qiu, Xipeng, Lin, Dahua, Wen, Yonggang, Jin, Xin, Zhang, Tianwei, Sun, Peng
Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with their sophisticated capabilities. Training these models requires vast GPU clusters and significant computing time, posing major challenges in terms of scalability …
External link:
http://arxiv.org/abs/2407.20018
The Graph Transformer is a new architecture that surpasses GNNs in graph learning. While inspiring algorithmic advances have emerged, their practical adoption remains limited, particularly on real-world graphs with up to millions of nodes. …
External link:
http://arxiv.org/abs/2407.14106
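A back-of-the-envelope check (not from the paper) of why vanilla global self-attention is impractical at the million-node scale the abstract mentions:

```python
# Dense self-attention materializes an N x N score matrix.
n_nodes = 1_000_000
bytes_per_score = 2  # fp16

attn_bytes = n_nodes ** 2 * bytes_per_score
print(f"{attn_bytes / 1e12:.0f} TB per attention matrix")  # 2 TB
# And that is a single head in a single layer; multiply by heads
# and layers, and the quadratic cost dominates everything else.
```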
Author:
Gu, Diandian, Sun, Peng, Hu, Qinghao, Huang, Ting, Chen, Xun, Xiong, Yingtong, Wang, Guoteng, Chen, Qiaoling, Zhao, Shangchun, Fang, Jiarui, Wen, Yonggang, Zhang, Tianwei, Jin, Xin, Liu, Xuanzhe
Efficiently training LLMs with long sequences is important yet challenging due to massive computation and memory requirements. Sequence parallelism has been proposed to tackle these problems, but existing methods suffer from scalability or efficiency …
External link:
http://arxiv.org/abs/2406.18485
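For readers unfamiliar with the technique, here is a single-process sketch of the basic idea behind sequence parallelism, not this paper's specific method: the sequence dimension is sharded across workers, and each shard's attention consumes keys and values gathered from all shards:

```python
# Toy sequence parallelism (generic idea, not the paper's algorithm).
# In a real system each chunk lives on a different device and the
# full K/V would come from an all-gather or ring exchange.
import torch

seq_len, n_dev, d = 8, 2, 16
q = torch.randn(seq_len, d)
k = torch.randn(seq_len, d)
v = torch.randn(seq_len, d)

outputs = []
for q_local in q.chunk(n_dev, dim=0):  # one query shard per "device"
    scores = q_local @ k.t() / d ** 0.5
    outputs.append(torch.softmax(scores, dim=-1) @ v)

out = torch.cat(outputs, dim=0)

# The sharded result matches unsharded attention up to float noise.
ref = torch.softmax(q @ k.t() / d ** 0.5, dim=-1) @ v
assert torch.allclose(out, ref, atol=1e-5)
```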
Author:
Hu, Qinghao, Ye, Zhisheng, Wang, Zerui, Wang, Guoteng, Zhang, Meng, Chen, Qiaoling, Sun, Peng, Lin, Dahua, Wang, Xiaolin, Luo, Yingwei, Wen, Yonggang, Zhang, Tianwei
Large Language Models (LLMs) have shown impressive performance across several transformative tasks. However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs, a process often riddled with challenges such as …
External link:
http://arxiv.org/abs/2403.07648
Author:
Chen, Qiaoling, Gu, Diandian, Wang, Guoteng, Chen, Xun, Xiong, YingTong, Huang, Ting, Hu, Qinghao, Jin, Xin, Wen, Yonggang, Zhang, Tianwei, Sun, Peng
Large language models (LLMs) with long sequences are beginning to power fundamentally new applications that we use every day. Existing methods for long-sequence LLM training are neither efficient nor compatible with commonly used training algorithms …
External link:
http://arxiv.org/abs/2401.09149
Author:
Zhu, Zeyu, Li, Fanrong, Li, Gang, Liu, Zejian, Mo, Zitao, Hu, Qinghao, Liang, Xiaoyao, Cheng, Jian
Graph Neural Networks (GNNs) are becoming a promising technique in various domains due to their excellent capabilities in modeling non-Euclidean data. Although a spectrum of accelerators has been proposed to accelerate the inference of GNNs, …
External link:
http://arxiv.org/abs/2311.09775
Author:
Chen, Qiaoling, Hu, Qinghao, Wang, Guoteng, Xiong, Yingtong, Huang, Ting, Chen, Xun, Gao, Yang, Yan, Hang, Wen, Yonggang, Zhang, Tianwei, Sun, Peng
Training large language models (LLMs) encounters challenges in GPU memory consumption due to the high memory requirements of model states. The widely used Zero Redundancy Optimizer (ZeRO) addresses this issue through strategic sharding, but introduces …
External link:
http://arxiv.org/abs/2311.00257
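An illustrative sketch of the general ZeRO idea, not this paper's variant: each data-parallel rank keeps optimizer state only for its 1/N slice of the parameters, trading memory for the communication needed to regather updated shards:

```python
# ZeRO-style optimizer-state sharding, reduced to the core arithmetic.
import torch

world_size = 4
params = torch.randn(1024)  # flattened model parameters

shards = params.chunk(world_size)  # one slice per data-parallel rank

# Adam keeps two fp32 moments per parameter. Without sharding, every
# rank stores both moments for all parameters; with ZeRO-1, only for
# its own shard.
full_state_bytes = 2 * params.numel() * 4
sharded_state_bytes = 2 * shards[0].numel() * 4
print(full_state_bytes, "->", sharded_state_bytes, "bytes per rank")
```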
Author:
Yao, Xingting, Hu, Qinghao, Liu, Tielong, Mo, Zitao, Zhu, Zeyu, Zhuge, Zhengyang, Cheng, Jian
Spiking neural networks (SNNs) have been thriving on numerous tasks, leveraging their promising energy efficiency and their potential as biologically plausible intelligence. Meanwhile, Neural Radiance Fields (NeRF) render high-quality …
External link:
http://arxiv.org/abs/2309.10987