Showing 1 - 10 of 13 for search: '"Sooraj Puthoor"'
Author:
Sooraj Puthoor, Mikko H. Lipasti
Published in:
ACM Transactions on Architecture and Code Optimization.
This paper introduces turn-based spatiotemporal coherence. Spatiotemporal coherence is a novel coherence implementation that assigns write permission to epochs (or turns) as opposed to a processor core. This paradigm shift in the assignment of write
Author:
Mikko H. Lipasti, Sooraj Puthoor
Published in:
ACM Transactions on Architecture and Code Optimization. 18:1-27
Sequential consistency (SC) is the most intuitive memory consistency model and the easiest for programmers and hardware designers to reason about. However, the strict memory ordering restrictions imposed by SC make it less attractive from a performan
Author:
Anirudh Mohan Kaushik, Noah Wolfe, Noel Chalmers, Bradford M. Beckmann, Scott Moe, Ashwin M. Aji, Muhammad Amber Hassaan, Sooraj Puthoor
Published in:
IISWC
General-Purpose Graphics Processing Units (GPGPUs) are employed in today's fastest supercomputers to accelerate a variety of scientific compute workloads. These workloads typically consist of data-parallel mathematical kernels that are well suited f
Author:
Bradford M. Beckmann, Tsung Tai Yeh, Xianwei Zhang, Alexandru Dutu, Anthony Gutierrez, Onur Kayiran, Srikant Bharadwaj, Matthew D. Sinclair, Michael LeBeane, Sooraj Puthoor, Johnathan Alsop, Brandon Potter
Published in:
IISWC
In recent years, machine intelligence (MI) applications have emerged as a major driver for the computing industry. Optimizing these workloads is important, but complicated. As memory demands grow and data movement overheads increasingly limit perform
Author:
Sooraj Puthoor, Mikko H. Lipasti
Published in:
PACT
Tightly integrated CPU-GPU systems that share the same virtual address space have significantly improved the programmability of GPUs in recent years. However, to achieve this, every memory access from a GPU has to go through an address translation un
Published in:
GPGPU@PPoPP
Two key trends in computing are evident: emergence of GPU as a first-class compute element and emergence of byte-addressable nonvolatile memory technologies (NVRAM) as DRAM-supplement. GPUs and NVRAMs are likely to coexist in future systems. Howev
Published in:
GPGPU@PPoPP
As GPUs become larger and provide an increasing number of parallel execution units, a single kernel is no longer sufficient to utilize all available resources. As a result, GPU applications are beginning to use fine-grain asynchronous kernels, which
Author:
Bradford M. Beckmann, Alexandru Dutu, Akshay Jain, Anthony Gutierrez, John Kalamatianos, Sooraj Puthoor, Xianwei Zhang, Matthew D. Sinclair, Mark Wyse, Joseph Gross, Jieming Yin, Brandon Potter, Timothy G. Rogers, Matthew Poremba, Onur Kayiran, Michael LeBeane
Published in:
HPCA
Modern GPU frameworks use a two-phase compilation approach. Kernels written in a high-level language are initially compiled to an implementation-agnostic intermediate language (IL), then finalized to the machine ISA only when the target GPU hardware
Published in:
MEMSYS
Current trends suggest that future computing platforms will be increasingly heterogeneous. While these heterogeneous processors physically integrate disparate computing elements like CPUs and GPUs on a single chip, their programmability critically de
Author:
Wei Wu, Sooraj Puthoor, Bradford M. Beckmann, Shuai Che, Mayank Daga, Gregory Rodgers, Ashwin M. Aji
Published in:
GPGPU@PPoPP
Achieving optimal performance on heterogeneous computing systems requires a programming model that supports the execution of asynchronous, multi-stream, and out-of-order tasks in a shared memory environment. Asynchronous dependency-driven tasking is