Showing 1 - 10 of 34 for search: '"Guei-Yuan Lueh"'
Author:
Daniel Rhee, Hongzheng Li, Wei-Yu Chen, Gang Chen, Hong Jiang, Joel Fuentes, Fangwen Fu, Kaiyu Chen, Guei-Yuan Lueh
Published in:
CGO
The SIMT execution model is commonly used for general GPU development. CUDA and OpenCL developers write scalar code that is implicitly parallelized by compiler and hardware. On Intel GPUs, however, this abstraction has profound performance implicatio
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e1d57836035b38d8071333c7ac1d9daa
Published in:
Parallel Processing and Applied Mathematics ISBN: 9783030432287
PPAM (1)
Non-blocking data structures are commonly used in multi-threaded applications and their implementation is based on the use of atomic operations. New computing architectures have incorporated data-parallel processing through SIMD instructions on integ
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::5baa3f3aac781a97443417496fe49873
https://doi.org/10.1007/978-3-030-43229-4_33
Published in:
IPDPS Workshops
With the advent of computing systems with on-die integrated graphics processing unit (iGPU), new general-purpose GPU programming challenges have emerged from these heterogeneous processors. We propose a lock-free skiplist for Intel's integrated graph
Author:
Anupama Chandrasekhar, Gang Chen, Po-Yu Chen, Wei-Yu Chen, Junjie Gu, Peng Guo, Shruthi Hebbur Prasanna Kumar, Guei-Yuan Lueh, Pankaj Mistry, Wei Pan, Thomas Raoux, Konrad Trifunovic
Published in:
2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
Register allocation is a well-studied problem, but surprisingly little work has been published on assigning registers for GPU architectures. In this paper we present the register allocator in the production compiler for Intel HD and Iris Graphics. In
Author:
Jamison D. Collins, Hong Jiang, Hong Wang, Perry Wang, Thomas A. Piazza, Gautham N. Chinya, Guei-Yuan Lueh
Published in:
ACM SIGOPS Operating Systems Review. 45:11-20
In this paper, we introduce Bothnia, an extension to the Intel production graphics driver to support a shared virtual memory heterogeneous multithreading programming model. With Bothnia, the Intel graphics device driver can support both the tradition
Author:
Hong Jiang, Gautham N. Chinya, Guei-Yuan Lueh, Jamison D. Collins, Nick Y. Yang, Hong Wang, Perry Wang, Xinmin Tian, Milind B. Girkar
Published in:
PLDI
Future mainstream microprocessors will likely integrate specialized accelerators, such as GPUs, onto a single die to achieve better performance and power efficiency. However, it remains a keen challenge to program such a heterogeneous multicore platf
Published in:
PLDI
A high-performance implementation of a Java Virtual Machine (JVM) consists of efficient implementation of Just-In-Time (JIT) compilation, exception handling, synchronization mechanism, and garbage collection (GC). These components are tightly coupled
Published in:
ACM Transactions on Programming Languages and Systems. 22:431-470
The register allocation phase of a compiler maps live ranges of a program to registers. If there are more candidates than there are physical registers, the register allocator must spill a live range (the home location is in memory) or split a live ra
Published in:
PLDI
A high-performance implementation of a Java Virtual Machine requires a compiler to translate Java bytecodes into native instructions, as well as an advanced garbage collector (e.g., copying or generational). When the Java heap is exhausted and the