Revisiting reuse for approximate query processing

Authors: Emanuel Zgraggen, Carsten Binnig, Alex Galakatos, Andrew Crotty, Tim Kraska
Publication year: 2017
Subject:
Source: Proceedings of the VLDB Endowment. 10:1142-1153
ISSN: 2150-8097
DOI: 10.14778/3115404.3115418
Description: Visual data exploration tools allow users to quickly gather insights from new datasets. As dataset sizes continue to increase, though, new techniques will be necessary to maintain the interactivity guarantees that these tools require. Approximate query processing (AQP) attempts to tackle this problem and allows systems to return query results at "human speed." However, existing AQP techniques start to break down when confronted with ad hoc queries that target the tails of the distribution. We therefore present an AQP formulation that can provide low-error approximate results at interactive speeds, even for queries over rare subpopulations. In particular, our formulation treats query results as random variables in order to leverage the ample opportunities for result reuse inherent in interactive data exploration. As part of our approach, we apply a variety of optimization techniques that are based on probability theory, including new query rewrite rules and index structures. We implemented these techniques in a prototype system and show that they can achieve interactivity where alternative approaches cannot.
Database: OpenAIRE