Incorporating statistical and machine learning techniques into the optimization of correction factors for software development effort estimation.

Authors: Nhung, Ho Le Thi Kim; Van Hai, Vo; Silhavy, Petr; Prokopova, Zdenka; Silhavy, Radek
Subject:
Source: Journal of Software: Evolution & Process; May 2024, Vol. 36, Issue 5, pp. 1-37 (37 pages)
Abstract: Accurate effort estimation is necessary for the efficient management of software development projects, as it relates to human resource management. Ensemble methods, which combine multiple statistical and machine learning techniques, offer more robust, reliable, and accurate effort estimation than single techniques. This study develops a stacking ensemble model based on optimization correction factors by integrating seven statistical and machine learning techniques (K-nearest neighbor, random forest, support vector regression, multilayer perceptron, gradient boosting, linear regression, and decision tree). Grid search is used to obtain valid search ranges and optimal configuration values, allowing more accurate estimation. We conducted experiments to compare the proposed method with related methods: use case points-based single methods, optimization correction factors-based single methods, and ensemble methods. The estimation accuracy of each method was evaluated using statistical tests and unbiased performance measures on four datasets. The proposed method maintained its estimation accuracy across the four experimental datasets and gave the best results in terms of the sum of squared errors, mean absolute error, root mean square error, mean balanced relative error, mean inverted balanced relative error, median magnitude of relative error, and percentage of prediction (0.25). The p-values of the t-tests showed that the proposed method is statistically superior to the other methods in terms of estimation accuracy. The results show that the proposed method is a comprehensive approach for improving estimation accuracy and minimizing project risks in the early stages of software development. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
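
Note: The abstract names seven accuracy measures without giving their formulas. The sketch below shows how they are commonly defined in the effort estimation literature; these definitions are an assumption and may differ in detail from those used in the article. The function name `effort_metrics` and the sample values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, median


def effort_metrics(actual, predicted, pred_level=0.25):
    """Compute common effort-estimation accuracy measures.

    Assumed definitions (not taken from the article):
      MRE  = |actual - predicted| / actual
      BRE  = |actual - predicted| / min(actual, predicted)
      IBRE = |actual - predicted| / max(actual, predicted)
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a <= 0 for a in actual) or any(p <= 0 for p in predicted):
        raise ValueError("effort values must be positive")

    pairs = list(zip(actual, predicted))
    abs_err = [abs(a - p) for a, p in pairs]
    sq_err = [e * e for e in abs_err]
    mre = [e / a for e, (a, _) in zip(abs_err, pairs)]
    bre = [e / min(a, p) for e, (a, p) in zip(abs_err, pairs)]
    ibre = [e / max(a, p) for e, (a, p) in zip(abs_err, pairs)]

    return {
        "SSE": sum(sq_err),                  # sum of squared errors
        "MAE": mean(abs_err),                # mean absolute error
        "RMSE": math.sqrt(mean(sq_err)),     # root mean square error
        "MBRE": mean(bre),                   # mean balanced relative error
        "MIBRE": mean(ibre),                 # mean inverted balanced relative error
        "MdMRE": median(mre),                # median magnitude of relative error
        # share of projects whose MRE is within pred_level
        "PRED": sum(1 for m in mre if m <= pred_level) / len(mre),
    }


# Hypothetical efforts (e.g. person-hours) for three projects.
actual_effort = [100.0, 200.0, 400.0]
estimated_effort = [80.0, 250.0, 400.0]
results = effort_metrics(actual_effort, estimated_effort)
```

Lower values are better for every measure except PRED(0.25), where a higher share of projects within 25% of the actual effort indicates a more accurate estimator.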