Combining Sampling and Ensemble Classifier for Multiclass Imbalance Data Learning
Author: | Faudziah Ahmad, Fairuz Adnan, Rayner Alfred, Mohd Shamrie Sainin |
---|---|
Year of publication: | 2018 |
Subject: |
business.industry; Computer science; Pattern recognition; 02 engineering and technology; Ensemble learning; Random forest; Naive Bayes classifier; C4.5 algorithm; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; Classifier (UML) |
Source: | Lecture Notes in Electrical Engineering ISBN: 9789811082757 |
DOI: | 10.1007/978-981-10-8276-4_25 |
Description: | This paper investigates how combining sampling techniques with ensemble classifiers affects prediction performance in multiclass imbalanced data learning. The experiments use shape data extracted from images of Malaysian medicinal leaves and three other large benchmark datasets. Seven ensemble methods from the Weka machine learning tool were selected to perform the classification task: AdaBoostM1, Bagging, Decorate, END, MultiBoostAB, RotationForest, and Stacking. To examine the performance of these ensemble methods, five base classifiers were used: Naive Bayes, SMO, J48, Random Forest, and Random Tree. Sampling and ensemble classification were combined in two ways: Resample with an ensemble classifier, and SMOTE with an ensemble classifier. The results show that no single configuration is a "one design that fits all". However, they also show that combining sampling with an ensemble classifier built on Random Forest improves prediction performance on the multiclass imbalanced datasets. |
Database: | OpenAIRE |
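The "sampling, then ensemble" pipeline described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' Weka setup: random oversampling stands in for Weka's Resample filter, a bagged 1-nearest-neighbour vote stands in for the ensemble and base classifiers named above, and the toy data and all function names (`oversample`, `fit_bagging`, `predict`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(X, y, rng):
    """Randomly duplicate minority-class rows until every class
    has as many rows as the largest class (assumed balancing rule)."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    target = max(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_bal.append(row)
            y_bal.append(label)
    return X_bal, y_bal


def nearest_label(train_X, train_y, point):
    """1-nearest-neighbour prediction with squared Euclidean distance
    (a stand-in base classifier, not one used in the paper)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], point)),
    )
    return train_y[best]


def fit_bagging(X, y, n_members, rng):
    """Build each ensemble member from a bootstrap sample of the balanced data."""
    n = len(X)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(([X[i] for i in idx], [y[i] for i in idx]))
    return members


def predict(members, point):
    """Majority vote over the ensemble members."""
    votes = Counter(nearest_label(mx, my, point) for mx, my in members)
    return votes.most_common(1)[0][0]


# Toy three-class imbalanced dataset: 6 rows of "a", 2 of "b", 1 of "c".
rng = random.Random(0)
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.3, 0.0), (0.2, 0.2),
     (5.0, 5.0), (5.2, 4.9),
     (0.0, 5.0)]
y = ["a"] * 6 + ["b"] * 2 + ["c"]

X_bal, y_bal = oversample(X, y, rng)                  # step 1: sampling
members = fit_bagging(X_bal, y_bal, n_members=7, rng=rng)  # step 2: ensemble

print(Counter(y_bal))
print(predict(members, (5.1, 5.0)))
```

Balancing before bootstrapping means each ensemble member is very likely to see every class, which is the intuition behind pairing a sampling filter with an ensemble. A SMOTE-style variant would replace the duplication in `oversample` with interpolation between a minority row and one of its nearest same-class neighbours.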