A Novel Two-Step Classification Approach for Runtime Performance Improvement of Duplicate Bug Report Detection

Authors: Behzad Soleimani Neysiani, Seyed Morteza Babamir
Language: English
Publication year: 2023
Subject:
Source: Computer and Knowledge Engineering, Vol 6, Iss 1, Pp 1-14 (2023)
Document type: article
ISSN: 2538-5453
2717-4123
DOI: 10.22067/cke.2022.63258.0
Description: Duplicate Bug Report Detection (DBRD) is a well-known problem in software triage systems such as Bugzilla. There are two main approaches to this problem: information retrieval and machine learning. The latter achieves better validation performance. Duplicate detection requires feature extraction, which is a time-consuming process. Both approaches suffer from runtime issues because they must compare each new bug report against all bug reports in the repository, so feature extraction and duplicate detection take a long time. This study proposes a new two-step classification approach that reduces the search space of the bug repository in the first step and then performs duplicate detection using textual features in the second. The Mozilla and Eclipse datasets are used for experimental evaluation. The results show an average validation performance of 87.70% for accuracy and 89.01% for F1-measure. Moreover, 95.85% and 87.65% of bug reports can be classified very quickly in step one for the Eclipse and Mozilla datasets, respectively; the remaining reports require textual feature extraction before they can be checked by the traditional DBRD approach. The proposed method achieves an average runtime improvement of 90%.
Database: Directory of Open Access Journals
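
Illustration: The two-step idea described in the abstract can be sketched roughly as follows. This is a minimal sketch and not the authors' method: the abstract does not say what the first step is based on, so the categorical pre-filter on `product` and `component`, the Jaccard token overlap used as the textual feature, the `threshold` value, and all names (`BugReport`, `step_one_filter`, `jaccard`, `find_duplicates`) are hypothetical stand-ins chosen only to show how a cheap first step can shrink the set of reports that need expensive textual comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BugReport:
    bug_id: int
    product: str    # hypothetical categorical field
    component: str  # hypothetical categorical field
    summary: str    # free-text field used for textual features


def step_one_filter(new, repository):
    """Step 1 (assumed criterion): keep only reports that share cheap
    categorical fields with the new report, shrinking the search space."""
    return [r for r in repository
            if r.product == new.product and r.component == new.component]


def jaccard(a, b):
    """Textual feature stand-in: Jaccard overlap of lowercased word sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def find_duplicates(new, repository, threshold=0.5):
    """Step 2: run the textual comparison only on step-one survivors.
    The threshold is an arbitrary illustrative value."""
    candidates = step_one_filter(new, repository)
    return [r.bug_id for r in candidates
            if jaccard(new.summary, r.summary) >= threshold]
```

In this sketch the textual comparison runs only on reports that pass the first step, which is where a runtime saving of the kind reported in the abstract would come from; a real system would replace both the filter and the similarity measure with trained classifiers.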