Abstract: |
Data stream mining is the process of extracting knowledge from continuously generated data stream records such as internet searches, phone conversations, and sensor data. It covers tasks such as frequency counting, clustering, analysis, and classification. Mining information from data streams is often considered a complicated process because of the high speed of data arrival and the rapid change in the underlying concept, which is referred to as concept drift. Moreover, data stream classification is not stationary, since the stream evolves over time. In addition, the classification process is unable to handle imbalanced data or to accommodate new classes. To overcome these problems, an Ensemble Learning model based on Support Vector Machine (ESVM) is proposed for data stream classification. To achieve higher diversity, each base SVM is trained on a different feature subset and updated when new data instances arrive. However, selecting optimal feature subsets from high-dimensional data streams is complex because of the growth in size and computational cost. Hence, Dynamic Accelerated Function (DAF) and Dynamic Candidate Solution (DCS) approaches are developed that diminish the classification error and improve performance with the best fitness value. The performance of the proposed methods is validated in terms of accuracy, precision, F-score, kappa, and relative error. The experimental results demonstrate that the proposed model is efficient in terms of classification accuracy, training speed, processing time, and kappa score, and that it attains an accuracy of 91.45%. |