Abstrakt: |
Parsers are omnipresent in almost all software systems. However, an operational implementation of parsers cannot answer many "how", "why" and "what if" questions, why does this parser not accept this string, or, can we have to parser for it to accept a set of strings? To lift the parsing theory to answer such questions, we build a symbolic encoding of parsing. Our symbolic encoding, that we referred to as a parse condition, is a logical condition that is satisfiable if and only if a given string w can be successfully parsed using a grammar G . We build an SMT encoding of such parse conditions for LL(1) grammars and demonstrate its utility by building three applications over it: automated repair of syntax errors in Tiger programs, automated parser synthesis to automatically synthesize LL(1) parsers from examples, and automatic root cause analysis of parser construction to debug errors in parse tables. We implement our ideas into a tool, Cyclops , that successfully repairs 80% of our benchmarks (675 buggy Tiger programs), clocking an average of 30 s per repair, synthesize parsers for interesting languages from examples, and ranks the ground truth as the topmost ranked reason for failure in more than 85% of our instances. On our deployment of Cyclops in a compiler design course, 91 of 128 students gave a positive response stating that assistance from Cyclops was indeed useful. [ABSTRACT FROM AUTHOR] |