Interactive Entity Centric Analysis of Log Data

Authors: Buqiao Deng, Cui Wei, Xiongpai Qin, Qiao Sun
Publication year: 2017
Subject:
Source: Web and Big Data ISBN: 9783319635637
APWeb/WAIM (2)
DOI: 10.1007/978-3-319-63564-4_34
Description: Interactive entity-centric analysis of log data can help us gain fine-grained insights into a business. In this paper, we first describe a fiber-based partitioning method for log data, which accelerates later entity-centric analysis. Second, we present our fiber-based partitioner, which is used by the Spark SQL query engine. The fiber-based partitioner takes the locations of data blocks into account both when loading data from HDFS into RDDs and when shuffling data from upstream operators to downstream operators during joins; this avoids data interchange between nodes and speeds up query processing. Finally, we present experimental results which demonstrate that the fiber-based partitioner improves the performance of entity-centric queries.
Database: OpenAIRE
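
The partitioning idea summarized in the description can be illustrated with a minimal sketch. It is not the authors' Spark SQL partitioner: it is plain Python, and it assumes (the record does not define the term) that a "fiber" is the group of log records belonging to one entity. The names `partition_for`, `partition_logs`, `events_per_entity`, and the sample records are all hypothetical. The sketch only shows why keeping each entity's records in one partition lets a per-entity query run without moving data between partitions; block-location awareness on HDFS is not modeled.

```python
import zlib
from collections import defaultdict


def partition_for(entity_id: str, num_partitions: int) -> int:
    """Map an entity id to a partition index using a stable hash (CRC32)."""
    return zlib.crc32(entity_id.encode("utf-8")) % num_partitions


def partition_logs(records, num_partitions):
    """Group (entity_id, event) records so each entity lives in one partition."""
    partitions = defaultdict(list)
    for entity_id, event in records:
        index = partition_for(entity_id, num_partitions)
        partitions[index].append((entity_id, event))
    return dict(partitions)


def events_per_entity(partition):
    """Per-entity aggregation that needs only the records of one partition."""
    counts = defaultdict(int)
    for entity_id, _event in partition:
        counts[entity_id] += 1
    return dict(counts)


# Hypothetical sample log: (entity id, event type)
logs = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-3", "login"),
    ("user-2", "logout"),
    ("user-1", "logout"),
]

parts = partition_logs(logs, num_partitions=4)

# Each partition can be analyzed independently; results are simply merged.
merged = {}
for partition in parts.values():
    merged.update(events_per_entity(partition))
```

Because every record of an entity maps to the same partition index, the per-partition counts never overlap, so merging them is a plain dictionary update rather than a cross-partition aggregation.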