Hybrid Parallel Linguistic Fuzzy Rules with Canopy MapReduce for Big Data Classification in Cloud

Authors: A. Rajiv Kannan, V. Vennila
Publication year: 2019
Subject:
Source: International Journal of Fuzzy Systems. 21:809-822
ISSN: 2199-3211
1562-2479
DOI: 10.1007/s40815-018-0597-x
Description: With the increasing availability of large amounts of information and the benefits of data processing, big data has gained great significance in recent years. Because such data must be processed at scale, big data applications are processed using the MapReduce programming model. However, applying rule-based models to these datasets is not straightforward, and big data are not classified efficiently. To overcome these problems, a parallel linguistic fuzzy rule with canopy MapReduce (LFR-CM) framework is introduced. The LFR-CM framework classifies big data using a canopy MapReduce function for information sharing in the cloud, with higher classification accuracy and lower time consumption. It comprises three steps for efficient classification in a cloud environment. First, it constructs the fuzzy knowledge base (KB) from the big data training set, where the linguistic fuzzy rules are built. The second step has three operations. The first operation is the map function, applied in parallel by every cloud user without transmitting any data to other cloud user nodes. The second operation is the processing of data through the map function across all additional cloud user nodes. The third operation is the reduce function, deployed by each cloud user on the partitioned information. Finally, the data are classified with higher classification accuracy and lower time consumption. The LFR-CM framework is implemented and evaluated on Amazon EC2 cloud big data datasets and compared with other classification systems that use MapReduce in terms of runtime, classification time, classification accuracy and input/output cost. Based on the results observed in the study, the LFR-CM framework is more efficient than the existing methods.
Database: OpenAIRE
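
Illustration: the map and reduce operations outlined in the description can be pictured with a small sketch. This is not the authors' LFR-CM implementation; it is a generic, single-machine approximation of learning linguistic fuzzy rules in a map/reduce style. Everything specific here is an assumption made for illustration: the three linguistic terms, the triangular membership functions on [0, 1], the rule-weighting scheme, and the function names `map_partition`, `reduce_rules` and `classify`. The canopy step is not reproduced because the abstract does not specify it; the partitions are simply taken as given.

```python
from collections import defaultdict

LABELS = ("low", "medium", "high")            # assumed linguistic terms
PEAKS = {"low": 0.0, "medium": 0.5, "high": 1.0}


def membership(x, label):
    """Triangular membership on [0, 1]; peaks are an assumption."""
    return max(0.0, 1.0 - abs(x - PEAKS[label]) / 0.5)


def matching_degree(features, antecedent):
    """Product of the memberships of each feature in its linguistic term."""
    degree = 1.0
    for x, label in zip(features, antecedent):
        degree *= membership(x, label)
    return degree


def map_partition(rows):
    """Map: build candidate rules from one partition only.

    No data leaves the partition; only the compact rule table is returned,
    keyed by (antecedent, class) with the summed matching degree as value.
    """
    rules = defaultdict(float)
    for features, cls in rows:
        antecedent = tuple(
            max(LABELS, key=lambda l: membership(x, l)) for x in features
        )
        rules[(antecedent, cls)] += matching_degree(features, antecedent)
    return dict(rules)


def reduce_rules(partial_tables):
    """Reduce: merge rule tables; per antecedent keep the heaviest class."""
    totals = defaultdict(float)
    for table in partial_tables:
        for key, weight in table.items():
            totals[key] += weight
    kb = {}
    for (antecedent, cls), weight in totals.items():
        if antecedent not in kb or weight > kb[antecedent][1]:
            kb[antecedent] = (cls, weight)
    return kb


def classify(kb, features):
    """Return the class of the rule with the highest degree * weight."""
    best_cls, best_score = None, 0.0
    for antecedent, (cls, weight) in kb.items():
        score = matching_degree(features, antecedent) * weight
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls


# Two toy partitions standing in for the data held by two cloud nodes.
part_1 = [((0.1, 0.2), "A"), ((0.9, 0.8), "B")]
part_2 = [((0.0, 0.1), "A"), ((1.0, 0.9), "B")]
kb = reduce_rules(map(map_partition, [part_1, part_2]))
```

In a real deployment each `map_partition` call would run on a separate node, and only the small rule tables would travel to the reducer, which is what keeps input/output cost low in this style of design.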