Synthesizing highly expressive SQL queries from input-output examples
Authors: | Chenglong Wang, Alvin Cheung, Rastislav Bodik |
---|---|
Publication year: | 2017 |
Subject: |
SQL
Computer science; Relational database; Programming language; InformationSystems_DATABASEMANAGEMENT; software engineering; engineering and technology; Data Transformation Services; Computer Graphics and Computer-Aided Design; Spatial query; Null (SQL); SQL injection; information systems; electrical engineering, electronic engineering, information engineering; artificial intelligence & image processing; Conjunctive query; Query by Example; Stored procedure; Software |
Source: | PLDI |
ISSN: | 1558-1160 0362-1340 |
DOI: | 10.1145/3140587.3062365 |
Description: | SQL is the de facto language for manipulating relational data. Though SQL is powerful, many users find it difficult to write SQL queries because of its highly expressive constructs. While using the programming-by-example paradigm to help users write SQL queries is an attractive proposition, as evidenced by online help forums such as Stack Overflow, developing techniques for synthesizing SQL queries from given input-output (I/O) examples has been difficult, due to the large space of SQL queries that results from its rich set of operators. In this paper, we present a new scalable and efficient algorithm for synthesizing SQL queries based on I/O examples. The key innovation of our algorithm is the development of a language for abstract queries, i.e., queries with uninstantiated operators, that can be used to express a large space of SQL queries efficiently. Using abstract queries to represent the search space decomposes the synthesis problem into two tasks: 1) searching for abstract queries that can potentially satisfy the given I/O examples, and 2) instantiating the found abstract queries and ranking the results. We have implemented this algorithm in a new tool called Scythe and evaluated it using 193 benchmarks collected from Stack Overflow. Our evaluation shows that Scythe can efficiently solve 74% of the benchmarks, most in just a few seconds, and the queries range from simple ones involving a single selection to complex queries with 6 nested subqueries. |
Database: | OpenAIRE |
External link: |
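
The two-task decomposition described in the abstract can be illustrated with a toy sketch. This is not Scythe's algorithm or code: the query shape (`SELECT cols FROM t WHERE column op constant`), the function names `search_abstract`, `instantiate`, and `synthesize`, the table name `t`, and the length-based ranking are all assumptions made for illustration. An abstract query here is a projection whose filter predicate is left as a hole; because the hole can only remove rows, the projected input is an upper bound on the result, so a skeleton that cannot contain the expected rows is discarded before any predicate is tried.

```python
import operator
from itertools import permutations

# Comparison operators allowed in the predicate hole (an assumed, tiny grammar).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def search_abstract(table, expected):
    """Task 1: enumerate skeletons SELECT cols FROM t WHERE ??.

    With the predicate unknown, a skeleton can return at most every input
    row projected onto cols. Skeletons whose upper bound misses an expected
    row are pruned without instantiating any predicate."""
    width = len(expected[0])
    for cols in permutations(range(len(table[0])), width):
        upper_bound = {tuple(row[c] for c in cols) for row in table}
        if set(expected) <= upper_bound:
            yield cols

def instantiate(table, expected, cols, names):
    """Task 2: fill the hole with predicates of the form column op constant,
    taking constants from the input table, and keep exact matches."""
    for c in range(len(table[0])):
        for const in sorted({row[c] for row in table}):
            for symbol, compare in OPS.items():
                result = [tuple(row[k] for k in cols)
                          for row in table if compare(row[c], const)]
                if sorted(result) == sorted(expected):
                    select = ", ".join(names[k] for k in cols)
                    yield f"SELECT {select} FROM t WHERE {names[c]} {symbol} {const!r}"

def synthesize(table, expected, names):
    """Run both tasks and rank candidates, shortest query text first
    (a stand-in heuristic, not the ranking used in the paper)."""
    found = [query
             for cols in search_abstract(table, expected)
             for query in instantiate(table, expected, cols, names)]
    return sorted(found, key=lambda q: (len(q), q))

# Hypothetical input-output example: names of people older than 25.
names = ["id", "name", "age"]
table = [(1, "ann", 31), (2, "bob", 25), (3, "cy", 40)]
expected = [("ann",), ("cy",)]
print(synthesize(table, expected, names))
```

On this example only the skeleton projecting `name` survives the first task, so the second task tests predicates against one skeleton instead of three. The sketch assumes each column holds mutually comparable values and that output rows may be compared as a multiset.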