A Distributed Decision Generation Algorithm based on Granular Computing Using Spark

Author: Zi-Yan Lin, 林子晏
Year of publication: 2016
Document type: thesis (學位論文)
Description: 105
The DGAGC algorithm, developed at National Central University, is a classification algorithm based on association-rule mining and searching. It also specifies a distributed computing approach for model training, implemented on top of Hadoop MapReduce. In this study, we propose a new distributed computing approach for the DGAGC algorithm based on Apache Spark. With Spark's support for in-memory computing, the new distributed DGAGC algorithm achieves a lower average execution time for model training on four different training data sets. In addition, we propose a distributed version of DGAGC for data classification.
Database: Networked Digital Library of Theses & Dissertations