Distributed Mining of Spatial High Utility Itemsets in Very Large Spatiotemporal Databases using Spark In-Memory Computing Architecture

Authors: Truong Cong Thang, R. Uday Kiran, Yukata Watanobe, Incheon Paik, Cheng-Wei Wu, Koji Zettsu, Minh-Son Dao, Sadanori Ito
Year of publication: 2020
Subject:
Source: IEEE BigData
Description: Finding Spatial High Utility Itemsets (SHUIs) in a spatiotemporal database is a challenging problem of great importance in many real-world applications. Most previous works focused on the sequential discovery of SHUIs in a database running on a single machine. Consequently, these works are not suitable for big data (or cloud-based) applications because they suffer from scalability and fault-tolerance problems. This paper proposes several novel pruning techniques to reduce the search space and presents a more flexible distributed algorithm that finds all desired itemsets in the database using the Spark in-memory computing architecture. Our algorithm inherits several advantages of Spark, including low communication cost, fault tolerance, and high scalability. Experimental results demonstrate that the proposed algorithm has good scalability and performance on very large databases. Finally, we present a real-world navigation application in which SHUIs generated from traffic congestion data are used to recommend alternative routes to users.
Database: OpenAIRE