Interactive Entity Centric Analysis of Log Data
Author: | Buqiao Deng, Cui Wei, Xiongpai Qin, Qiao Sun |
---|---|
Year of publication: | 2017 |
Subject: |
060201 languages & linguistics
SQL; Shuffling; Computer science; Node (networking); Fiber (computer science); 06 humanities and the arts; 02 engineering and technology; computer.software_genre; 0602 languages and literature; Spark (mathematics); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Upstream (networking); Granularity; Data mining; Downstream (networking); computer; computer.programming_language |
Source: | Web and Big Data ISBN: 9783319635637 APWeb/WAIM (2) |
DOI: | 10.1007/978-3-319-63564-4_34 |
Description: | Interactive entity-centric analysis of log data can yield fine-grained insights into a business. In this paper, we first describe a fiber-based partitioning method for log data, which accelerates later entity-centric analysis. Second, we present our fiber-based partitioner, which is used by the Spark SQL query engine. The partitioner takes the locations of data blocks into account both when loading data from HDFS into RDDs and when shuffling data from upstream operators to downstream operators during joins; this avoids data interchange between nodes and speeds up query processing. Finally, we present experimental results demonstrating that the fiber-based partitioner improves entity-centric queries. |
Database: | OpenAIRE |
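
The co-partitioning idea in the description can be illustrated with a minimal plain-Python sketch. It is not the authors' Spark SQL partitioner: it assumes that a "fiber" is the set of log records belonging to one entity (the abstract does not define the term), it ignores HDFS block locality, and the names `fiber_partition`, `partition_records`, and `local_join` are hypothetical. The point it shows is that when two data sets are partitioned by the same entity key, a join can run partition by partition without moving records between partitions.

```python
from collections import defaultdict
from zlib import crc32


def fiber_partition(entity_id: str, num_partitions: int) -> int:
    """Map an entity id to a partition index with a stable hash.

    Assumption: every record of one entity (its "fiber") must land
    in the same partition.
    """
    return crc32(entity_id.encode("utf-8")) % num_partitions


def partition_records(records, num_partitions):
    """Group records into partitions keyed by their entity id."""
    partitions = defaultdict(list)
    for record in records:
        pid = fiber_partition(record["entity"], num_partitions)
        partitions[pid].append(record)
    return partitions


def local_join(left_parts, right_parts):
    """Join two co-partitioned data sets one partition at a time.

    Matching entities share a partition index, so each partition
    only ever looks at its own counterpart on the other side.
    """
    joined = []
    for pid, left_rows in left_parts.items():
        # Index the right-hand rows of this partition by entity.
        by_entity = defaultdict(list)
        for row in right_parts.get(pid, []):
            by_entity[row["entity"]].append(row)
        for left_row in left_rows:
            for right_row in by_entity[left_row["entity"]]:
                joined.append((left_row, right_row))
    return joined
```

In Spark itself, the comparable mechanism would be a custom partitioner applied to both sides of a join; the sketch only mirrors that idea on in-memory lists.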