Showing 1 - 10 of 273 for search: '"OLUKOTUN, KUNLE"'
We propose DFModel, a modeling framework for mapping dataflow computation graphs onto large-scale systems. Mapping a workload to a system requires optimizing dataflow mappings at various levels, including the inter-chip (between chips) level and the
External link:
http://arxiv.org/abs/2412.16432
Authors:
Prabhakar, Raghu, Sivaramakrishnan, Ram, Gandhi, Darshan, Du, Yun, Wang, Mingran, Song, Xiangyu, Zhang, Kejie, Gao, Tianren, Wang, Angela, Li, Karen, Sheng, Yongning, Brot, Joshua, Sokolov, Denis, Vivek, Apurv, Leung, Calvin, Sabnis, Arjun, Bai, Jiayu, Zhao, Tuowen, Gottscho, Mark, Jackson, David, Luttrell, Mark, Shah, Manish K., Chen, Edison, Liang, Kaizhao, Jain, Swayambhoo, Thakker, Urmish, Huang, Dawei, Jairath, Sumti, Brown, Kevin J., Olukotun, Kunle
Monolithic large language models (LLMs) like GPT-4 have paved the way for modern generative AI applications. Training, serving, and maintaining monolithic LLMs at scale, however, remains prohibitively expensive and challenging. The disproportionate i
External link:
http://arxiv.org/abs/2405.07518
Transformer models serve as the backbone of many state-of-the-art language models, and most use the scaled dot-product attention (SDPA) mechanism to capture relationships between tokens. However, the straightforward implementation of SDPA has quadrati
External link:
http://arxiv.org/abs/2404.16629
Authors:
Rucker, Alexander, Sundram, Shiv, Smith, Coleman, Vilim, Matthew, Prabhakar, Raghu, Kjolstad, Fredrik, Olukotun, Kunle
Spatial dataflow architectures such as reconfigurable dataflow accelerators (RDA) can provide much higher performance and efficiency than CPUs and GPUs. In particular, vectorized reconfigurable dataflow accelerators (vRDA) in recent literature repres
External link:
http://arxiv.org/abs/2302.06124
Authors:
Hellsten, Erik, Souza, Artur, Lenfers, Johannes, Lacouture, Rubens, Hsu, Olivia, Ejjeh, Adel, Kjolstad, Fredrik, Steuwer, Michel, Olukotun, Kunle, Nardi, Luigi
We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. Particular
External link:
http://arxiv.org/abs/2212.11142
We introduce Stardust, a compiler that compiles sparse tensor algebra to reconfigurable dataflow architectures (RDAs). Stardust introduces new user-provided data representation and scheduling language constructs for mapping to resource-constrained ac
External link:
http://arxiv.org/abs/2211.03251
Authors:
Hsu, Olivia, Strange, Maxwell, Sharma, Ritvik, Won, Jaeyeon, Olukotun, Kunle, Emer, Joel, Horowitz, Mark, Kjolstad, Fredrik
Published in:
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (2023), pp. 710-726
We propose the Sparse Abstract Machine (SAM), an abstract machine model for targeting sparse tensor algebra to reconfigurable and fixed-function spatial dataflow accelerators. SAM defines a streaming dataflow abstraction with sparse primitives that e
External link:
http://arxiv.org/abs/2208.14610
Support for Machine Learning (ML) applications in networks has significantly improved over the last decade. The availability of public datasets and programmable switching fabrics (including low-level languages to program them) present a full-stack to
External link:
http://arxiv.org/abs/2206.05592
As programmers turn to software-defined hardware (SDH) to maintain a high level of productivity while programming hardware to run complex algorithms, heavy-lifting must be done by the compiler to automatically partition on-chip arrays. In this paper,
External link:
http://arxiv.org/abs/2202.01261
Authors:
Rucker, Alexander, Vilim, Matthew, Zhao, Tian, Zhang, Yaqi, Prabhakar, Raghu, Olukotun, Kunle
This paper proposes Capstan: a scalable, parallel-patterns-based, reconfigurable dataflow accelerator (RDA) for sparse and dense tensor applications. Instead of designing for one application, we start with common sparse data formats, each of which su
External link:
http://arxiv.org/abs/2104.12760