srcDiff: Syntactic Differencing to Support Software Maintenance and Evolution

Author: Decker, Michael John
Language: English
Publication year: 2017
Subject:
Document type: Text
Description: The dissertation presents the construction and evaluation of an efficient and scalable rule-based syntactic-differencing approach and its application to several software engineering research tasks. The approach, called srcDiff, is built upon the srcML infrastructure. srcML adds abstract syntactic information to the code via an XML format. A syntactic difference of srcML documents is then computed. During this process, the differences are further refined using a set of rules that model typical editing patterns of source code by developers. Thus, the resulting deltas model edits to the code that are programmer-centric rather than a purely syntactic tree-edit view. Other syntactic differencing approaches focus on obtaining an optimal tree-edit distance with the assumption that this will produce an accurate difference. By contrast, srcDiff purposely deviates from an optimal tree difference in order to create a delta that is both easier to understand and better models changes between the original and modified versions. To evaluate the approach, a comparison study against a state-of-the-art syntactic differencing approach and two line-based differencing tools is conducted as an online within-participant study of about 70 subjects. The results show that the rule-based syntactic-differencing approach produces more accurate and understandable deltas. srcDiff is utilized to analyze the complete history of fifty open-source software systems for a number of software engineering research tasks. The first task is an investigation into the stability and consequences of changes in a method’s stereotype. The results show that a method’s stereotype is very stable and that certain classes of stereotype changes are indicators of poor method design and inappropriate changes to methods. The second task is an investigation into how often developers rewrite code and the consequences. The results show that developers frequently rewrite code at the text, expression, and statement level, and that the replacement of functions is not uncommon. The last study investigates how often developers convert code from one type to another. The results show that conversion occurs most frequently between declaration-statements, expression-statements, and return-statements. The consequence for both code replacement and code conversion is that a syntactic differencing approach must support both to arrive at meaningful and accurate deltas.
Database: Networked Digital Library of Theses & Dissertations