An Efficient Graphics Processing Unit Scheme for Complex Geometry Simulations Using the Lattice Boltzmann Method
Authors: | Binghai Wen, Gang Huang, Hongyin Zhu, Zhangrong Qin, Xin Xu |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: |
General Computer Science;
Discretization; Computer science; Computation; Addressing scheme; General Engineering; Graphics processing unit; Lattice Boltzmann methods; 01 natural sciences; 010305 fluids & plasmas; Computational science; 010101 applied mathematics; Complex geometry; graphic processing unit (GPU); lattice Boltzmann method; Lattice (order); 0103 physical sciences; Fluid dynamics; General Materials Science; complex geometry; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0101 mathematics; lcsh:TK1-9971 |
Source: | IEEE Access, Vol 8, Pp 185158-185168 (2020) |
ISSN: | 2169-3536 |
Description: | The lattice Boltzmann method is fully discretized in space, time, and velocity; its inherent parallelism makes it well suited to graphics processing unit (GPU) acceleration in large-scale fluid dynamics simulations. When the lattice Boltzmann method is used to simulate a fluid system with complex geometry, the flow field is usually compressed to reduce memory consumption, and fluid nodes are accessed indirectly to improve computational efficiency. We designed a pointer array that is the same size as the flow field and is based on the unified memory technology of the Compute Unified Device Architecture (CUDA) platform. The addresses of the fluid nodes are stored in this array, and the other nodes, which are unallocated, are marked as null. To recover the coordinates of the fluid nodes in the original flow field, we stored the addresses of the non-null pointer array elements as part of the lattice attributes, at the end of the lattice attribute array, forming a cyclic pointer structure that tracks geometric information. We validated the feasibility of this addressing scheme with an experimental simulation of aqueous humor in the anterior segment of the eye, and tested its performance on graphics processing units of the Pascal, Volta, and Turing architectures. The method distributes data carefully to generate fewer memory transactions and to reduce access times of the global memory, achieving a performance improvement of approximately 18%. |
Database: | OpenAIRE |
External link: |
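
The addressing scheme summarized in the description can be illustrated with a small host-side sketch. This is not the authors' implementation: it is plain C++ rather than CUDA, and the names `SparseLattice`, `grid`, `home`, `coords`, and `neighbor`, the D2Q9 velocity set, and the direction-major layout are all assumptions made for illustration. The record describes the back-pointers as stored at the end of the lattice attribute array; here they sit in a separate vector to keep the types simple. In a CUDA setting the arrays would presumably be allocated with `cudaMallocManaged`, so that pointers built on the host remain valid on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch of a compressed lattice with a cyclic pointer structure.
struct SparseLattice {
    static constexpr int Q = 9;   // assumed D2Q9 velocity set
    int nx, ny;                   // size of the full (uncompressed) flow field
    std::size_t n_fluid = 0;      // number of fluid nodes actually stored
    std::vector<double> f;        // distributions, fluid nodes only: f[q * n_fluid + k]
    std::vector<double*> grid;    // pointer array, same size as the flow field; nullptr = not fluid
    std::vector<double**> home;   // per fluid node: address of its slot in `grid`

    SparseLattice(const std::vector<std::uint8_t>& is_fluid, int nx_, int ny_)
        : nx(nx_), ny(ny_) {
        assert(is_fluid.size() == static_cast<std::size_t>(nx) * ny);
        for (std::uint8_t v : is_fluid) n_fluid += (v != 0);
        f.assign(Q * n_fluid, 0.0);             // sized once, so addresses stay stable
        grid.assign(is_fluid.size(), nullptr);
        home.resize(n_fluid);
        std::size_t k = 0;
        for (std::size_t i = 0; i < is_fluid.size(); ++i) {
            if (!is_fluid[i]) continue;
            grid[i] = &f[k];       // forward link: field position -> compressed node
            home[k] = &grid[i];    // backward link: compressed node -> field position
            ++k;
        }
    }

    // Copying would leave the stored pointers aimed at the source object.
    SparseLattice(const SparseLattice&) = delete;
    SparseLattice& operator=(const SparseLattice&) = delete;

    // Recover the (x, y) coordinates of fluid node k from its back-pointer.
    std::pair<int, int> coords(std::size_t k) const {
        std::ptrdiff_t i = home[k] - grid.data();
        return {static_cast<int>(i % nx), static_cast<int>(i / nx)};
    }

    // Indirect neighbor lookup; nullptr if the neighbor is outside the field or not fluid.
    double* neighbor(std::size_t k, int dx, int dy) const {
        std::pair<int, int> c = coords(k);
        int x = c.first + dx;
        int y = c.second + dy;
        if (x < 0 || x >= nx || y < 0 || y >= ny) return nullptr;
        return grid[static_cast<std::size_t>(y) * nx + x];
    }
};
```

Following `grid[i]` and then `home[k]` returns to the same slot, which is the "cyclic" property: memory is spent on distributions only for fluid nodes, while each node can still find its position and its neighbors in the original field.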