On Combining IR Methods to Improve Bug Localization
Authors: | Miroslav Tushev, Saket Khatiwada, Anas Mahmoud |
---|---|
Year of publication: | 2020 |
Subject: |
Computer science; Software engineering; Engineering and technology; Machine learning; Natural sciences; Statistics & probability; Similarity (network science); Face (geometry); Electrical engineering, electronic engineering, information engineering; Benchmark (computing); Software debugging; Artificial intelligence; Mathematics; Precision and recall; Cognitive load |
Source: | ICPC |
DOI: | 10.1145/3387904.3389280 |
Description: | Information Retrieval (IR) methods have recently been employed to provide automatic support for bug localization tasks. However, for an IR-based bug localization tool to be useful, it has to achieve adequate retrieval accuracy. Lower precision and recall can leave developers with large amounts of incorrect information to wade through. To address this issue, in this paper, we systematically investigate the impact of combining various IR methods on the retrieval accuracy of bug localization engines. The main assumption is that different IR methods, targeting different dimensions of similarity between artifacts, can be used to enhance the confidence in each other's results. Five benchmark systems from different application domains are used to conduct our analysis. The results show that a) near-optimal global configurations can be determined for different combinations of IR methods, b) optimized IR-hybrids can significantly outperform individual methods as well as other unoptimized methods, and c) hybrid methods achieve their best performance when utilizing information-theoretic IR methods. Our findings can be used to enhance the practicality of IR-based bug localization tools and minimize the cognitive overload developers often face when locating bugs. |
Database: | OpenAIRE |
External link: |