Showing 1 - 10 of 11 for search: '"Shane Ryoo"'
Published in:
Computing in Science & Engineering. 11:16-26
Graphics processing units (GPUs) can provide excellent speedups on some, but not all, general-purpose workloads. Using a set of computational GPU kernels as examples, the authors show how to adapt kernels to utilize the architectural features of a Ge
Author:
Shane Ryoo, Sain-Zee Ueng, Wen-mei W. Hwu, Sara S. Baghsorkhi, Christopher I. Rodrigues, John A. Stratton, Sam S. Stone
Published in:
Journal of Parallel and Distributed Computing. 68:1389-1401
Contemporary many-core processors such as the GeForce 8800 GTX enable application developers to utilize various levels of parallelism to enhance the performance of their applications. However, iterative optimization for such a system may lead to a lo
Published in:
IEEE Micro. 26:40-47
Microprocessors exploit instruction-level parallelism and tolerate memory-access latencies to achieve high performance. Out-of-order microprocessors do this by dynamically scheduling instruction execution, but require power-hungry hardware structures
Published in:
Languages and Compilers for Parallel Computing ISBN: 9783540852605
LCPC
Media and scientific simulation applications have a large amount of parallelism that can be exploited in contemporary multi-core microprocessors. However, traditional pointer and array analysis techniques often fall short in automatically identifying
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::9cb832317092533ab6f7c4decf137ee0
https://doi.org/10.1007/978-3-540-85261-2_8
Author:
Wen-mei W. Hwu, Sam S. Stone, Sain-Zee Ueng, John A. Stratton, Shane Ryoo, Sara S. Baghsorkhi, Christopher I. Rodrigues
Published in:
CGO
Program optimization for highly-parallel systems has historically been considered an art, with experts doing much of the performance tuning by hand. With the introduction of inexpensive, single-chip, massively parallel platforms, more developers will
Author:
Christopher I. Rodrigues, Wen-mei W. Hwu, Sam S. Stone, David B. Kirk, Shane Ryoo, Sara S. Baghsorkhi
Published in:
PPOPP
GPUs have recently attracted the attention of many application developers as commodity data-parallel coprocessors. The newest generations of GPU architecture provide easier programmability and increased generality while maintaining the tremendous mem
Published in:
ICS
Data-parallel co-processors have the potential to improve performance in highly parallel regions of code when coupled to a general-purpose CPU. However, applications often have to be modified in non-intuitive and complicated ways to mitigate the cost
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bd1b72949c97b1d1a766a63e9ece5877
Author:
David Kirk, Kuangwei Huang, Shane Ryoo, Wen-mei Hwu, John A. Stratton, Christopher Rodrigues
Published in:
2007 IEEE Hot Chips 19 Symposium (HCS).
This article consists of a collection of slides from the author's conference presentation on NVIDIA's GeForce 8800 GTX family of products. Some of the specific topics discussed include: the special features, system specifications, and system design f
Author:
Matthew I. Frank, Wen-mei W. Hwu, Sain-Zee Ueng, Christopher I. Rodrigues, Robert E. Kidd, Shane Ryoo
Published in:
Lecture Notes in Computer Science ISBN: 9783540715276
With the increasing use of multi-core microprocessors and hardware accelerators in embedded media processing systems, there is an increasing need to discover coarse-grained parallelism in media applications written in C and C++. Common versions of th
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::330d3ac3ad0655385c8bf1c596554097
https://doi.org/10.1007/978-3-540-71528-3_13
Author:
Robert E. Kidd, Sara S. Baghsorkhi, Steve Lumetta, Matthew I. Frank, Isaac Gelado, Stephanie C. Tsao, John H. Kelm, Sain-Zee Ueng, Aqeel Mahesri, Nacho Navarro, Wen-mei W. Hwu, Sam S. Stone, Sanjay J. Patel, Shane Ryoo
Published in:
DAC
This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism,