Code generation for efficient query processing in managed runtimes

Authors: Stratis D. Viglas, Fabian Nagel, Gavin Bierman
Year of publication: 2014
Subject:
Source: Nagel, F, Bierman, G M & Viglas, S D 2014, 'Code Generation for Efficient Query Processing in Managed Runtimes', Proceedings of the VLDB Endowment (PVLDB), vol. 7, no. 12, pp. 1095-1106. <http://www.vldb.org/pvldb/vol7/p1095-nagel.pdf>
ISSN: 2150-8097
DOI: 10.14778/2732977.2732984
Description: In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous 'impedance mismatch' problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also lets them use the same query language to query an application's in-memory collections. The latter offers further transparency to developers, as the query language and all data are represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying, and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language, as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.
Database: OpenAIRE