Cross-Project Software Defect Prediction Based on Class Code Similarity

Authors: Wanzhi Wen, Chenqiang Shen, Xiaohong Lu, Zhixian Li, Haoren Wang, Ruinian Zhang, Ningbo Zhu
Language: English
Year of publication: 2022
Subject:
Source: IEEE Access, Vol 10, Pp 105485-105495 (2022)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3211401
Description: Software defect prediction techniques help developers find software defects as early as possible and reduce the cost of software development. These techniques usually predict defects in the target project using the entire source project. However, the data distributions of the entire source project and the target project generally differ widely, so prediction accuracy is not high. We propose CCS-CPDP, a cross-project software defect prediction technique based on class code similarity. First, the technique converts the code set extracted via the AST (Abstract Syntax Tree) into a vector set using the DTI (Doc2Bow and TF-IDF) strategy. Second, it calculates the similarity between the vector sets of the target project and the training projects. Finally, following the KNN principle that the majority decides the category, it determines the number of most similar class instances in the training project, refines the source project by selecting those class instances, and then predicts and evaluates software defects. We compared CCS-CPDP with software defect prediction methods based on four traditional classification models (KNN, Random Forest, Naive Bayes, and Logistic Regression). Experimental results show that CCS-CPDP can improve the effectiveness of CPDP in terms of recall and F1-score.
Database: Directory of Open Access Journals
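
The three steps named in the description (AST extraction, bag-of-words plus TF-IDF vectorization, similarity-based selection of source class instances) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Python source and the standard-library `ast` module as a stand-in for whatever AST extraction the paper uses, a plain `Counter` for the bag-of-words step, a smoothed IDF formula, cosine similarity as the similarity measure, and a fixed `k` supplied by the caller rather than derived as in the paper. All function names (`ast_tokens`, `tfidf_vectors`, `cosine`, `select_similar`) are hypothetical.

```python
import ast
import math
from collections import Counter


def ast_tokens(source):
    """Return the AST node-type names of a code snippet (assumed token scheme)."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def tfidf_vectors(docs):
    """Turn token lists into sparse {token: weight} vectors.

    Term counts (bag of words) are weighted by a smoothed IDF;
    the exact weighting used in the paper is not stated in the abstract.
    """
    n = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}
    return [{tok: cnt * idf[tok] for tok, cnt in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors (assumed similarity measure)."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_similar(source_classes, target_classes, k):
    """Keep, for every target class, its k most similar source classes.

    Returns the sorted indices of the retained source classes, i.e. the
    refined training set on which a classifier would then be fitted.
    """
    docs = [ast_tokens(code) for code in source_classes + target_classes]
    vectors = tfidf_vectors(docs)
    src = vectors[:len(source_classes)]
    tgt = vectors[len(source_classes):]
    keep = set()
    for t in tgt:
        ranked = sorted(range(len(src)), key=lambda i: cosine(src[i], t), reverse=True)
        keep.update(ranked[:k])
    return sorted(keep)
```

The indices returned by `select_similar` would be used to filter the labeled source instances before training any of the classifiers mentioned in the description; that training step is omitted here.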