Chi-BD-DRF: Design of Scalable Fuzzy Classifiers for Big Data via A Dynamic Rule Filtering Approach
Author: | Mohammad Masoud Javidi, Isaac Triguero, Alberto Fernández, Fatemeh Aghaeipoor |
---|---|
Year of publication: | 2020 |
Subject: |
Fuzzy rule; Computer science; Big data; Engineering and technology; Fuzzy control system; Fuzzy logic; Information systems; Outlier; Scalability; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Data mining; Interpretability |
Source: | FUZZ-IEEE |
DOI: | 10.1109/fuzz48607.2020.9177626 |
Description: | Big Data classification problems can no longer be addressed by sequential algorithms. It is therefore necessary to design and develop novel solutions that provide accurate yet interpretable models within a tolerable elapsed time. In this area, Fuzzy Rule-Based Classification Systems are very advantageous because they are intrinsically interpretable and accurate. However, when these systems are applied in Big Data scenarios, the rule set can become too large to be useful, and many of the generated rules may be associated with non-dense areas or outliers. The presence of such rules in the rule base not only increases running time and computational overhead but also harms the interpretability of the fuzzy system. In this contribution, we propose a novel approach to obtain compact and accurate fuzzy models for Big Data problems with a time complexity that scales linearly. To do so, a dynamic filtering approach is applied to remove low-support rules. Moreover, an efficient computation of the rules' weights is presented to improve the accuracy of the predictions. The model is developed for Big Data analytics using the Apache Spark framework. This makes it possible to take advantage of the built-in resources and directives for transparent distributed computing, as well as the machine learning pipeline, which eases the complete processing. Experimental results on different Big Data problems confirm the quality of the proposed algorithm with respect to the baseline fuzzy classifier. |
Database: | OpenAIRE |
External link: |
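
The description mentions removing low-support rules and weighting the remaining ones. The following is only a minimal, single-machine sketch of that general idea, not the authors' Spark implementation. It assumes features scaled to [0, 1], uniform triangular fuzzy partitions, a hypothetical cut-off defined as a fraction (`ratio`) of the mean rule support, and a simple confidence value standing in for the rule weight; none of these details come from the record above.

```python
from collections import Counter, defaultdict


def fuzzy_label(x, n_labels=3):
    """Index of the best-matching triangular fuzzy set on [0, 1].

    With evenly spaced triangular sets, the set with the highest
    membership is the one whose peak is nearest to x.
    """
    return min(n_labels - 1, max(0, round(x * (n_labels - 1))))


def build_rules(examples, n_labels=3):
    """Map each antecedent (tuple of fuzzy labels) to per-class example counts."""
    counts = defaultdict(Counter)
    for features, cls in examples:
        antecedent = tuple(fuzzy_label(x, n_labels) for x in features)
        counts[antecedent][cls] += 1
    return counts


def filter_rules(counts, ratio=0.5):
    """Drop rules whose support is below ratio * mean support (assumed cut-off).

    Each surviving rule keeps its majority class and a confidence value
    (majority count / support) used here as a simplified rule weight.
    """
    supports = {a: sum(c.values()) for a, c in counts.items()}
    threshold = ratio * sum(supports.values()) / len(supports)
    rules = {}
    for antecedent, class_counts in counts.items():
        if supports[antecedent] >= threshold:
            cls, n = class_counts.most_common(1)[0]
            rules[antecedent] = (cls, n / supports[antecedent])
    return rules


# Toy data (hypothetical): two dense regions and one isolated example.
examples = [
    ((0.1, 0.1), "a"), ((0.05, 0.0), "a"), ((0.0, 0.1), "a"), ((0.1, 0.05), "b"),
    ((0.9, 0.95), "b"), ((1.0, 0.9), "b"), ((0.9, 1.0), "b"),
    ((0.5, 0.95), "a"),  # lone example in a sparse region
]
counts = build_rules(examples)
rules = filter_rules(counts)
print(rules)
```

In this toy data the rule generated by the lone example has support 1, which falls below the cut-off, so it is discarded while the two rules from the dense regions are kept. Because the cut-off is derived from the data rather than fixed in advance, it adapts to the rule-support distribution; how the paper actually defines its dynamic threshold is not stated in this record.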