Revisiting reuse for approximate query processing
Author: | Emanuel Zgraggen, Carsten Binnig, Alex Galakatos, Andrew Crotty, Tim Kraska |
---|---|
Year of publication: | 2017 |
Subject: | Theoretical computer science; Computer science; General Engineering; Software engineering; Engineering and technology; Reuse; Query optimization; Variety (cybernetics); Index (publishing); Information systems; Electrical engineering, electronic engineering, information engineering; Leverage (statistics); Data mining |
Source: | Proceedings of the VLDB Endowment. 10:1142-1153 |
ISSN: | 2150-8097 |
DOI: | 10.14778/3115404.3115418 |
Description: | Visual data exploration tools allow users to quickly gather insights from new datasets. As dataset sizes continue to increase, though, new techniques will be necessary to maintain the interactivity guarantees that these tools require. Approximate query processing (AQP) attempts to tackle this problem and allows systems to return query results at "human speed." However, existing AQP techniques start to break down when confronted with ad hoc queries that target the tails of the distribution. We therefore present an AQP formulation that can provide low-error approximate results at interactive speeds, even for queries over rare subpopulations. In particular, our formulation treats query results as random variables in order to leverage the ample opportunities for result reuse inherent in interactive data exploration. As part of our approach, we apply a variety of optimization techniques that are based on probability theory, including new query rewrite rules and index structures. We implemented these techniques in a prototype system and show that they can achieve interactivity where alternative approaches cannot. |
Database: | OpenAIRE |