Výsledky vyhledávání

Report

TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading

Autor: Wu, Kun, Park, Jeongmin Brian, Zhang, Xiaofan, Hidayetoğlu, Mert, Mailthody, Vikram Sharma, Huang, Sitao, Lumetta, Steven Sam, Hwu, Wen-mei

The growth rate of the GPU memory capacity has not been able to keep up with that of the size of large language models (LLMs), hindering the model training process. In particular, activations -- the intermediate tensors produced during forward propag

Externí odkaz: http://arxiv.org/abs/2408.10013

Zobrazit plný text záznamu

Report

HiCCL: A Hierarchical Collective Communication Library

Autor: Hidayetoglu, Mert, de Gonzalo, Simon Garcia, Slaughter, Elliott, Surana, Pinku, Hwu, Wen-mei, Gropp, William, Aiken, Alex

HiCCL (Hierarchical Collective Communication Library) addresses the growing complexity and diversity in high-performance network architectures. As GPU systems have envolved into networks of GPUs with different multilevel communication hierarchies, op

Externí odkaz: http://arxiv.org/abs/2408.05962

Zobrazit plný text záznamu

Report

LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme

Autor: Park, Jeongmin Brian, Wu, Kun, Mailthody, Vikram Sharma, Quresh, Zaid, Mahlke, Scott, Hwu, Wen-mei

Graph Neural Networks (GNNs) are widely used today in recommendation systems, fraud detection, and node/link classification tasks. Real world GNNs continue to scale in size and require a large memory footprint for storing graphs and embeddings that o

Externí odkaz: http://arxiv.org/abs/2407.15264

Zobrazit plný text záznamu

Report

Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level

Autor: Hassani, Ali, Hwu, Wen-Mei, Shi, Humphrey

Neighborhood attention reduces the cost of self attention by restricting each token's attention span to its nearest neighbors. This restriction, parameterized by a window size and dilation factor, draws a spectrum of possible attention patterns betwe

Externí odkaz: http://arxiv.org/abs/2403.04690

Zobrazit plný text záznamu

Report

RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design

Autor: Reidys, Benjamin, Xue, Yuqi, Li, Daixuan, Sukhwani, Bharat, Hwu, Wen-mei, Chen, Deming, Asaad, Sameh, Huang, Jian

Software-defined networking (SDN) and software-defined flash (SDF) have been serving as the backbone of modern data centers. They are managed separately to handle I/O requests. At first glance, this is a reasonable design by following the rack-scale

Externí odkaz: http://arxiv.org/abs/2309.06513

Zobrazit plný text záznamu

Report

CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs

Autor: Park, Jeongmin, Qureshi, Zaid, Mailthody, Vikram, Gacek, Andrew, Shao, Shunfan, AlMasri, Mohammad, Gelado, Isaac, Xiong, Jinjun, Newburn, Chris, Chung, I-hsin, Garland, Michael, Sakharnykh, Nikolay, Hwu, Wen-mei

Data compression and decompression have become vital components of big-data applications to manage the exponential growth in the amount of data collected and stored. Furthermore, big-data applications have increasingly adopted GPUs due to their high

Externí odkaz: http://arxiv.org/abs/2307.03760

Zobrazit plný text záznamu

Kniha

Heterogeneous system architecture : a new compute platform infrastructure / Wen-mei W. Hwu. [elektronicky zdroj]

Autor: Hwu, Wen-mei, author

Externí odkaz: Kolekce e-knih KNAV

Report

Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses

Autor: Park, Jeongmin Brian, Mailthody, Vikram Sharma, Qureshi, Zaid, Hwu, Wen-mei

Graph Neural Networks (GNNs) are emerging as a powerful tool for learning from graph-structured data and performing sophisticated inference tasks in various application domains. Although GNNs have been shown to be effective on modest-sized graphs, tr

Externí odkaz: http://arxiv.org/abs/2306.16384

Zobrazit plný text záznamu

Report

IGB: Addressing The Gaps In Labeling, Features, Heterogeneity, and Size of Public Graph Datasets for Deep Learning Research

Autor: Khatua, Arpandeep, Mailthody, Vikram Sharma, Taleka, Bhagyashree, Ma, Tengfei, Song, Xiang, Hwu, Wen-mei

Publikováno v: KDD 2023

Graph neural networks (GNNs) have shown high potential for a variety of real-world, challenging applications, but one of the major obstacles in GNN research is the lack of large-scale flexible datasets. Most existing public datasets for GNNs are rela

Externí odkaz: http://arxiv.org/abs/2302.13522

Zobrazit plný text záznamu

Report

Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures

Autor: Wu, Kun, Hidayetoğlu, Mert, Song, Xiang, Huang, Sitao, Zheng, Da, Nisa, Israt, Hwu, Wen-mei

Relational graph neural networks (RGNNs) are graph neural networks with dedicated structures for modeling the different types of nodes and edges in heterogeneous graphs. While RGNNs have been increasingly adopted in many real-world applications due t

Externí odkaz: http://arxiv.org/abs/2301.06284

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání