Investigation of the performance of LU decomposition method using CUDA
Authors: | Caner Ozcan, Baha Sen |
---|---|
Year of publication: | 2012 |
Subject: | LU decomposition, Floating point, Computer science, Triangular matrix, Parallel computing, Computational science, CUDA, Gaussian elimination, Dense linear systems, Linear algebra, General Earth and Planetary Sciences, Central processing unit, General-purpose computing on graphics processing units, GPU computing, General Environmental Science |
Source: | Procedia Technology. 1:50-54 |
ISSN: | 2212-0173 |
DOI: | 10.1016/j.protcy.2012.02.011 |
Description: | In recent years, parallel processing has been widely used in the computer industry. Software developers have to deal with parallel computing platforms and technologies to provide novel and rich experiences. We present a novel algorithm to solve dense linear systems using Compute Unified Device Architecture (CUDA). High-level linear algebra operations require intensive computation. In this study, a Graphics Processing Unit (GPU) accelerated implementation of the LU linear algebra routine is presented. LU decomposition is a factorization of the form A=LU, where A is a square matrix and L and U are lower and upper triangular matrices, respectively. This means that L has only zeros above the diagonal and U has only zeros below the diagonal. The main idea of LU decomposition is to record the steps used in Gaussian elimination on A in the positions where the zeros are produced. We have worked to increase performance through proper data representation and by reducing row operations on the GPU. Because of the high arithmetic throughput of GPUs, initial experimental results promise a bright future for GPU computing, which has been shown to be useful for scientific computations. GPUs have high memory bandwidth and more floating-point units than CPUs. We ran our experiments on different systems with different GPUs and CPUs, and also evaluated the computations for different linear systems. When we compared the results obtained on the two kinds of processor, GPU computing gave better performance. According to the results, the GPU computation ran approximately 3 times faster than the CPU computation. Our implementation provides a significant performance improvement, so it can readily be used to solve dense linear systems. |
Database: | OpenAIRE |
External link: |
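
The description says that LU decomposition records the Gaussian elimination steps in the positions where zeros are produced. The following is a minimal CPU-side sketch of that idea in plain Python, assuming a Doolittle factorization without row exchanges; it is not the authors' CUDA implementation, and the function names `lu_in_place` and `split_lu` are hypothetical.

```python
def lu_in_place(a):
    """Doolittle LU without pivoting (an assumption of this sketch).

    Returns one matrix that holds U on and above the diagonal and the
    elimination multipliers (the entries of L) below it. The unit
    diagonal of L is implied and not stored.
    """
    n = len(a)
    m = [list(map(float, row)) for row in a]  # work on a copy of the input
    for k in range(n - 1):
        pivot = m[k][k]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot: this sketch does no row exchanges")
        for i in range(k + 1, n):
            factor = m[i][k] / pivot
            # Elimination would put a zero here; store the multiplier instead.
            m[i][k] = factor
            for j in range(k + 1, n):
                m[i][j] -= factor * m[k][j]
    return m


def split_lu(m):
    """Unpack the combined matrix into explicit L (unit diagonal) and U."""
    n = len(m)
    lower = [[m[i][j] if j < i else (1.0 if j == i else 0.0) for j in range(n)]
             for i in range(n)]
    upper = [[m[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return lower, upper


def matmul(x, y):
    """Plain triple-loop matrix product, used only to check that L * U == A."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[2.0, 1.0, 1.0],
     [4.0, -6.0, 0.0],
     [-2.0, 7.0, 2.0]]
combined = lu_in_place(A)
L, U = split_lu(combined)
```

For a fixed `k`, the updates over `i` and `j` are independent of one another, which is the kind of work a GPU implementation could spread across threads; the abstract does not describe how the authors' kernels are organized.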