Analyzing the Impact of Ensemble Techniques and Resampling Techniques Over Multi Class Skewed Datasets

Authors: Rose Mary Mathew, Gunasundari R
Year of publication: 2022
Source: Advances in Intelligent Systems and Technologies, pp. 1-13
DOI: 10.53759/aist/978-9914-9946-0-5_1
Description: Machine learning is of great importance in this era because of its broad spectrum of applications and its capability to adapt and provide solutions to complex problems reliably, rapidly, and productively. Machine learning models are trained on data from past experience and produce outcomes based on what they have learned. The data used to train these models should be balanced; otherwise the model gives incorrect results. Data plays an important role here, and most real-world data, across all sectors, is skewed towards some classes. Multimajority datasets and multiminority datasets are the different types of imbalance seen in multiclass datasets. This study analyses three datasets from the multimajority domain and three datasets from the multiminority domain. Six resampling procedures were applied, three of them undersampling and three oversampling. Four classifiers, K-NN, SVM, Random Forest and XGBoost, were used to build the models, and their performance was analysed.
Database: OpenAIRE
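
The resampling step mentioned in the description can be illustrated with a small sketch. The record does not name the six procedures used in the study, so plain random oversampling and random undersampling stand in here as assumptions. The helper names (`random_oversample`, `random_undersample`) and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(samples, labels):
    """Collect the rows belonging to each class label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = max(len(rows) for rows in groups.values())
    out_x, out_y = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        out_x.extend(rows + extra)
        out_y.extend([label] * target)
    return out_x, out_y


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = min(len(rows) for rows in groups.values())
    out_x, out_y = [], []
    for label, rows in groups.items():
        out_x.extend(rng.sample(rows, target))
        out_y.extend([label] * target)
    return out_x, out_y


# Hypothetical toy data: two large classes and one small class.
labels = ["a"] * 50 + ["b"] * 45 + ["c"] * 5
samples = [[float(i)] for i in range(len(labels))]

over_x, over_y = random_oversample(samples, labels)
under_x, under_y = random_undersample(samples, labels)
print(Counter(over_y))   # class counts after oversampling
print(Counter(under_y))  # class counts after undersampling
```

In practice, resampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class distribution.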