GPU-Accelerated High-Throughput Online Stream Data Processing
Authors: | Jielong Xu, Charles A. Kamhoua, Kevin Kwiat, Jian Tang, Chonggang Wang, Zhenhua Chen |
---|---|
Publication year: | 2018 |
Subject: | Data processing; Information Systems and Management; Data stream mining; Computer science; 020208 electrical & electronic engineering; 02 engineering and technology; Parallel computing; GeneralLiterature_MISCELLANEOUS; Stream processing; Parallel processing (DSP implementation); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; SIMD; General-purpose computing on graphics processing units; Throughput (business); Massively parallel; Information Systems |
Source: | IEEE Transactions on Big Data. 4:191-202 |
ISSN: | 2372-2096 |
DOI: | 10.1109/tbdata.2016.2616116 |
Description: | The Single Instruction Multiple Data (SIMD) architecture of Graphics Processing Units (GPUs) makes them perfect for parallel processing of big data. In this paper, we present the design, implementation and evaluation of G-Storm, a GPU-enabled parallel system based on Storm, which harnesses the massively parallel computing power of GPUs for high-throughput online stream data processing. G-Storm has the following desirable features: 1) G-Storm is designed to be a general data processing platform like Storm, which can handle various applications and data types. 2) G-Storm exposes GPUs to Storm applications while preserving its easy-to-use programming model. 3) G-Storm achieves high-throughput and low-overhead data processing with GPUs. 4) G-Storm accelerates data processing further by enabling Direct Data Transfer (DDT) between two executors that process data at a common GPU. We implemented G-Storm based on Storm 0.9.2 and tested it using three different applications: continuous query, matrix multiplication and image resizing. Extensive experimental results show that: 1) Compared to Storm, G-Storm achieves over 7× improvement in throughput for continuous query, while maintaining reasonable average tuple processing time. It also leads to 2.3× and 1.3× throughput improvements on the other two applications, respectively. 2) DDT significantly reduces data processing time. |
Database: | OpenAIRE |
External link: |