Search results - "Sooraj Puthoor"

Turn-based Spatiotemporal Coherence for GPUs

Published in: ACM Transactions on Architecture and Code Optimization.

This paper introduces turn-based spatiotemporal coherence. Spatiotemporal coherence is a novel coherence implementation that assigns write permission to epochs (or turns) as opposed to a processor core. This paradigm shift in the assignment of write

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::5ae90c7da87a862d94f62b0fbf744dd4
https://doi.org/10.1145/3593054

Systems-on-Chip with Strong Ordering

Authors: Mikko H. Lipasti, Sooraj Puthoor

Published in: ACM Transactions on Architecture and Code Optimization. 18:1-27

Sequential consistency (SC) is the most intuitive memory consistency model and the easiest for programmers and hardware designers to reason about. However, the strict memory ordering restrictions imposed by SC make it less attractive from a performan

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::7744c9ca65eccee9f8e926a4322d000d
https://doi.org/10.1145/3428153

Optimizing Hyperplane Sweep Operations Using Asynchronous Multi-grain GPU Tasks

Authors: Anirudh Mohan Kaushik, Noah Wolfe, Noel Chalmers, Bradford M. Beckmann, Scott Moe, Ashwin M. Aji, Muhammad Amber Hassaan, Sooraj Puthoor

Published in: IISWC

General-Purpose Graphics Processing Units (GPGPUs) are employed in today's fastest supercomputers to accelerate a variety of scientific compute workloads. These workloads typically comprise of data-parallel mathematical kernels that are well suited f

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::daefac063a156f13ae54846daa648559
https://doi.org/10.1109/iiswc47752.2019.9042134

Optimizing GPU Cache Policies for MI Workloads

Authors: Bradford M. Beckmann, Tsung Tai Yeh, Xianwei Zhang, Alexandru Dutu, Anthony Gutierrez, Onur Kayiran, Srikant Bharadwaj, Matthew D. Sinclair, Michael LeBeane, Sooraj Puthoor, Johnathan Alsop, Brandon Potter

Published in: IISWC

In recent years, machine intelligence (MI) applications have emerged as a major driver for the computing industry. Optimizing these workloads is important, but complicated. As memory demands grow and data movement overheads increasingly limit perform

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::495afa77ed128b45d0a50ee929828600
https://doi.org/10.1109/iiswc47752.2019.9041977

Compiler assisted coalescing

Authors: Sooraj Puthoor, Mikko H. Lipasti

Published in: PACT

Tightly integrated CPU-GPU systems that share the same virtual address space have significantly improved the programmability of GPUs in recent years. However, to achieve this, every memory access from a GPU has to go through an address translation un

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::bf2ee37671c0024d55e864fa4c10189e
https://doi.org/10.1145/3243176.3243203

A Case for Scoped Persist Barriers in GPUs

Authors: Mitesh R. Meswani, Arkaprava Basu, Sooraj Puthoor, Dibakar Gope

Published in: GPGPU@PPoPP

Two key trends in computing are evident --- emergence of GPU as a first-class compute element and emergence of byte-addressable nonvolatile memory technologies (NVRAM) as DRAM-supplement. GPUs and NVRAMs are likely to coexist in future systems. Howev

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::250b1133bd8e57e5cc1fc739035520a4
https://doi.org/10.1145/3180270.3180275

Oversubscribed Command Queues in GPUs

Authors: Joseph Gross, Sooraj Puthoor, Bradford M. Beckmann, Xulong Tang

Published in: GPGPU@PPoPP

As GPUs become larger and provide an increasing number of parallel execution units, a single kernel is no longer sufficient to utilize all available resources. As a result, GPU applications are beginning to use fine-grain asynchronous kernels, which

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::553078f3758e9e6b17db6d01fd564ed1
https://doi.org/10.1145/3180270.3180271

Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level

Authors: Bradford M. Beckmann, Alexandru Dutu, Akshay Jain, Anthony Gutierrez, John Kalamatianos, Sooraj Puthoor, Xianwei Zhang, Matthew D. Sinclair, Mark Wyse, Joseph Gross, Jieming Yin, Brandon Potter, Timothy G. Rogers, Matthew Poremba, Onur Kayiran, Michael LeBeane

Published in: HPCA

Modern GPU frameworks use a two-phase compilation approach. Kernels written in a high-level language are initially compiled to an implementation agnostic intermediate language (IL), then finalized to the machine ISA only when the target GPU hardware

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::14f1f20f30506b426664c079a8d2466f
https://doi.org/10.1109/hpca.2018.00058

Software Assisted Hardware Cache Coherence for Heterogeneous Processors

Authors: Sooraj Puthoor, Arkaprava Basu, Shuai Che, Bradford M. Beckmann

Published in: MEMSYS

Current trends suggest that future computing platforms will be increasingly heterogeneous. While these heterogeneous processors physically integrate disparate computing elements like CPUs and GPUs on a single chip, their programmability critically de

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::1debea192eafc6e1a9c205f4792aaf27
https://doi.org/10.1145/2989081.2989092

Implementing directed acyclic graphs with the heterogeneous system architecture

Authors: Wei Wu, Sooraj Puthoor, Bradford M. Beckmann, Shuai Che, Mayank Daga, Gregory Rodgers, Ashwin M. Aji

Published in: GPGPU@PPoPP

Achieving optimal performance on heterogeneous computing systems requires a programming model that supports the execution of asynchronous, multi-stream, and out-of-order tasks in a shared memory environment. Asynchronous dependency-driven tasking is

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::936ac1389b596dbd461efefb91374b89
https://doi.org/10.1145/2884045.2884052
