Classification of dry-beans using synthetic minority over-sampling technique and stochastic gradient boosting machines.

Authors: Koeshardianto, Meidya; Permana, Kurniawan Eka; Kartika, Dhian Satria Yudha; Setiawan, Wahyudi
Subject:
Source: AIP Conference Proceedings; 2024, Vol. 3176 Issue 1, p1-8, 8p
Abstract: Dry-bean classification aims to determine the physical features that characterize each class. This study classifies seven dry-bean categories: Barbunya, Bombay, Cali, Horoz, Dermason, Seker, and Sira. The dataset contains 13,611 records described by 16 morphological features. Because the number of records per class is imbalanced, the preprocessing step balances the data using the Synthetic Minority Over-Sampling Technique (SMOTE). After balancing, classification is performed with Stochastic Gradient Boosting Machines (SGBM), a development of the Gradient Boosting Tree that takes the best results from the boosting process. The test scenario consists of three models: imbalanced classes, classes balanced with SMOTE, and normalization using a min-max scaler. The tests also compare SGBM with other machine learning methods. The combination of SMOTE and SGBM gives the best results, with an accuracy of 95.38%, precision of 95.46%, recall of 95.44%, and F1-score of 95.48%. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
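
The record does not include the authors' implementation, so the following is only a minimal sketch of the SMOTE balancing step the abstract names, as the technique is commonly described: a synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the parameters `k` and `seed`, and the two-feature toy data are all assumptions made for illustration; none of them come from the paper or the dry-bean dataset.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; not the authors' code.
    """
    if n_synthetic <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(samples_by_class, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    target = max(len(rows) for rows in samples_by_class.values())
    balanced = {}
    for offset, (label, rows) in enumerate(samples_by_class.items()):
        extra = smote(rows, target - len(rows), k=k, seed=seed + offset)
        balanced[label] = [list(r) for r in rows] + extra
    return balanced


# Toy two-feature data, invented for this sketch (not the dry-bean dataset).
toy = {
    "majority": [[0.0, 0.0], [1.0, 0.2], [0.5, 1.0],
                 [0.2, 0.7], [0.9, 0.9], [0.4, 0.3]],
    "minority": [[5.0, 5.0], [6.0, 5.5], [5.5, 6.0]],
}
balanced = balance(toy, k=2)
print({label: len(rows) for label, rows in balanced.items()})
# prints {'majority': 6, 'minority': 6}
```

In practice a library implementation would normally be preferred over hand-written code. For example, scikit-learn's `GradientBoostingClassifier` performs stochastic gradient boosting when its `subsample` parameter is set below 1.0; whether the authors used that library or those settings is not stated in the abstract.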