Performance Evaluation of Some Machine Learning Algorithms in NASA Defect Prediction Data Sets

Authors: Zeynep Behrin Guven Aydin, Ruya Samli
Year of publication: 2020
Subject:
Source: 2020 5th International Conference on Computer Science and Engineering (UBMK).
DOI: 10.1109/ubmk50275.2020.9219531
Description: The main purpose of machine learning is to model systems that make predictions by applying mathematical and computational operations to data [1]. Today, machine learning is studied in all areas of the software world. Software defect prediction is a rapidly progressing sub-branch of machine learning. In this study, five machine learning classification algorithms were applied, using the Python programming language, to the defect prediction data sets JM1, KC1, CM1, and PC1 from the PROMISE repository. These data sets were created within the scope of the publicly available NASA Metric Data Program. The accuracy, recall, precision, F-measure, and support values of the algorithms on the data are compared. In terms of accuracy, the algorithms score quite high on all four data sets. Among the four data sets, the highest success rates were obtained on CM1 and PC1. On all four data sets, the highest success rates were achieved by the Random Forest algorithm.
Database: OpenAIRE
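
The description compares accuracy, recall, precision, F-measure, and support. As a rough illustration of how these values relate to one another, here is a minimal sketch in plain Python. The function name `classification_metrics` and the label lists are hypothetical and made up for this example; they are not taken from the paper or from the NASA data sets, and the paper's own tooling is not stated in this record.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy plus precision, recall, F-measure and support
    for the class given by `positive` (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Tally the four cells of the binary confusion matrix.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            counts["tp"] += 1
        elif truth != positive and pred == positive:
            counts["fp"] += 1
        elif truth == positive and pred != positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1

    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    predicted_positive = tp + fp
    support = tp + fn  # number of true instances of the positive class

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / support if support else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "support": support,
    }


# Made-up labels (1 = defective module, 0 = non-defective); not NASA data.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy example the accuracy is 0.7 while the recall for the defective class is only about 0.33, which suggests why accuracy is usually read alongside the other metrics when defective modules are a minority.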