Popis: |
Background: Developers use Static Analysis Tools (SATs) to control for potential quality issues in source code, including defects and technical debt. Tool vendors have devised quite a number of tools, which makes it harder for practitioners to select the most suitable one for their needs. To better support developers, researchers have been conducting several studies on SATs to favor the understanding of their actual capabilities. Aims: Despite the work done so far, there is still a lack of knowledge regarding (1) what is their agreement, and (2) what is the precision of their recommendations. We aim at bridging this gap by proposing a large-scale comparison of six popular SATs for Java projects: Better Code Hub, CheckStyle, Coverity Scan, FindBugs, PMD, and SonarQube. Methods: We analyze 47 Java projects applying 6 SATs. To assess their agreement, we compared them by manually analyzing – at line – and class-level — whether they identify the same issues. Finally, we evaluate the precision of the tools against a manually-defined ground truth. Results: The key results show little to no agreement among the tools and a low degree of precision. Conclusion: Our study provides the first overview on the agreement among different tools as well as an extensive analysis of their precision that can be used by researchers, practitioners, and tool vendors to map the current capabilities of the tools and envision possible improvements. publishedVersion |