Software fault prediction using machine learning techniques with metric thresholds

Autor: Raed Shatnawi
Rok vydání: 2021
Předmět:
Zdroj: International Journal of Knowledge-based and Intelligent Engineering Systems. 25:159-172
ISSN: 1875-8827
1327-2314
Popis: BACKGROUND: Fault data is vital to predicting the fault-proneness in large systems. Predicting faulty classes helps in allocating the appropriate testing resources for future releases. However, current fault data face challenges such as unlabeled instances and data imbalance. These challenges degrade the performance of the prediction models. Data imbalance happens because the majority of classes are labeled as not faulty whereas the minority of classes are labeled as faulty. AIM: The research proposes to improve fault prediction using software metrics in combination with threshold values. Statistical techniques are proposed to improve the quality of the datasets and therefore the quality of the fault prediction. METHOD: Threshold values of object-oriented metrics are used to label classes as faulty to improve the fault prediction models The resulting datasets are used to build prediction models using five machine learning techniques. The use of threshold values is validated on ten large object-oriented systems. RESULTS: The models are built for the datasets with and without the use of thresholds. The combination of thresholds with machine learning has improved the fault prediction models significantly for the five classifiers. CONCLUSION: Threshold values can be used to label software classes as fault-prone and can be used to improve machine learners in predicting the fault-prone classes.
Databáze: OpenAIRE
Nepřihlášeným uživatelům se plný text nezobrazuje