Description: |
Cross-project defect prediction (CPDP) predicts defects in software projects for which little training data is available, and it aims to produce generalizable prediction models. Because data from different projects are heterogeneous, CPDP is particularly challenging, and several methods have been proposed to address this problem. However, the class imbalance of cross-project defect data also makes learning harder, and it has not been investigated in depth. This paper proposes a novel cost-cognitive ensemble method for CPDP that comprises four phases: a bagging balanced resampling phase, a base classifier learning phase, a cost value cognitive phase, and a base classifier ensemble phase. Together, these phases produce a composition of classifiers that is used to predict defects. Results of an empirical evaluation on 10 datasets from the PROMISE repository indicate that our method achieves the best overall performance compared with conventional methods. Moreover, our method can determine the cost value automatically during model training and is shown to be more effective and practical.   |