CU++: an object oriented framework for computational fluid dynamics applications using graphics processing units
Authors: Dimitri J. Mavriplis, Dominic D. J. Chandar, Jayanarayanan Sitaraman
Year: 2013
Keywords: Object-oriented programming; Operator overloading; Interface (Java); Computer science; Message Passing Interface; Parallel computing; Data structure; Porting; Metaprogramming; Theoretical Computer Science; CUDA; Hardware and Architecture; SIMD; General-purpose computing on graphics processing units; Software; Parallel array; Information Systems; Compile time
Source: The Journal of Supercomputing, 67:47-68
ISSN: 1573-0484, 0920-8542
DOI: 10.1007/s11227-013-0985-9
Description: The application of graphics processing units (GPUs) to solve partial differential equations is gaining popularity with the advent of improved computer hardware. Various lower-level interfaces exist that allow the user to access GPU-specific functions; one such interface is NVIDIA's Compute Unified Device Architecture (CUDA) library. However, porting existing codes to run on the GPU requires the user to write kernels that execute on many cores following the Single Instruction Multiple Data (SIMD) model. In the present work, a higher-level framework, termed CU++, has been developed that uses object-oriented programming techniques available in C++ such as polymorphism, operator overloading, and template metaprogramming. Using this approach, CUDA kernels can be generated automatically at compile time. Briefly, CU++ allows a code developer with only C/C++ knowledge to write programs that execute on the GPU without any knowledge of CUDA-specific programming techniques. This approach is highly beneficial for Computational Fluid Dynamics (CFD) code development because it removes the need to create hundreds of GPU kernels for various purposes. In its current form, CU++ provides a framework for parallel array arithmetic, simplified data structures to interface with the GPU, and smart array indexing. An implementation of heterogeneous parallelism, i.e., using multiple GPUs to simultaneously process a partitioned grid system with communication at the interfaces via the Message Passing Interface (MPI), has been developed and tested. (A minimal illustrative sketch of the kernel-generation idea follows this record.)
Database: OpenAIRE
External link:
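The description attributes CU++'s automatic kernel generation to C++ operator overloading and template metaprogramming. The sketch below shows the general expression-template technique this implies, written in plain CUDA C++; it is not the published CU++ interface, and every name in it (DeviceArray, BinaryExpr, evaluate, assign) is a hypothetical illustration chosen for this example.

```cpp
// Illustrative-only sketch (not the CU++ API): expression templates turn
// "R = A + B * C" into a single automatically generated CUDA kernel.
#include <cuda_runtime.h>

// CRTP base so the overloaded operators only match expression types.
template <typename E> struct Expr {};

// Leaf node: a thin wrapper around a device array.
struct DeviceArray : Expr<DeviceArray> {
    double* d;  // device pointer
    int     n;  // number of elements
    DeviceArray(double* d_, int n_) : d(d_), n(n_) {}
    __device__ double operator[](int i) const { return d[i]; }
};

// Interior node: represents "l op r" without evaluating it on the host.
template <typename L, typename R, typename Op>
struct BinaryExpr : Expr<BinaryExpr<L, R, Op>> {
    L l; R r;
    BinaryExpr(const L& l_, const R& r_) : l(l_), r(r_) {}
    __device__ double operator[](int i) const { return Op::apply(l[i], r[i]); }
};

struct Add { __device__ static double apply(double a, double b) { return a + b; } };
struct Mul { __device__ static double apply(double a, double b) { return a * b; } };

// Overloaded operators assemble the expression type at compile time.
template <typename L, typename R>
BinaryExpr<L, R, Add> operator+(const Expr<L>& l, const Expr<R>& r) {
    return BinaryExpr<L, R, Add>(static_cast<const L&>(l), static_cast<const R&>(r));
}
template <typename L, typename R>
BinaryExpr<L, R, Mul> operator*(const Expr<L>& l, const Expr<R>& r) {
    return BinaryExpr<L, R, Mul>(static_cast<const L&>(l), static_cast<const R&>(r));
}

// One generic kernel evaluates any expression tree element-wise (SIMD style).
template <typename E>
__global__ void evaluate(double* out, E e, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = e[i];
}

// "Assignment" is what actually launches the generated kernel.
template <typename E>
void assign(DeviceArray& dst, const Expr<E>& e) {
    const int threads = 256;
    const int blocks  = (dst.n + threads - 1) / threads;
    evaluate<<<blocks, threads>>>(dst.d, static_cast<const E&>(e), dst.n);
}

int main() {
    const int n = 1 << 20;
    double *a, *b, *c, *r;
    cudaMalloc((void**)&a, n * sizeof(double));
    cudaMalloc((void**)&b, n * sizeof(double));
    cudaMalloc((void**)&c, n * sizeof(double));
    cudaMalloc((void**)&r, n * sizeof(double));
    // (device arrays left uninitialised here; a real code would fill them)
    DeviceArray A(a, n), B(b, n), C(c, n), R(r, n);

    // Reads like ordinary array arithmetic, but compiles into one fused kernel.
    assign(R, A + B * C);

    cudaDeviceSynchronize();
    cudaFree(a); cudaFree(b); cudaFree(c); cudaFree(r);
    return 0;
}
```

The point of the sketch is that `A + B * C` is never evaluated on the host: the overloaded operators only build a lightweight type, and a single templated kernel instantiation then evaluates the whole right-hand side element-wise on the GPU. This is the mechanism the description credits with sparing CFD developers from hand-writing hundreds of CUDA kernels.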