Real-Time Snapshot Maintenance with Incremental ETL Pipelines in Data Warehouses
Author: | Vinanthi Basavaraj, Weiping Qu, Sahana Shankar, Stefan Dessloch |
---|---|
Publication year: | 2015 |
Subject: | |
Source: | Big Data Analytics and Knowledge Discovery ISBN: 9783319227283 DaWaK |
DOI: | 10.1007/978-3-319-22729-0_17 |
Description: | Multi-version concurrency control (MVCC) is now widely used in data warehouses to give OLAP queries and ETL maintenance flows concurrent access. A snapshot of the existing warehouse tables is taken so that a query can be answered independently of concurrent updates. In this work, we extend this snapshot with the deltas that reside at the source side of the ETL flows. Before a query is answered, the relevant tables are first refreshed with exactly the source deltas captured at the time the query arrives (the so-called query-driven policy). Snapshot maintenance is performed by an incremental recomputation pipeline that is flushed by a set of consecutive deltas belonging to a sequence of incoming queries. A workload scheduler is used to achieve a serializable schedule of concurrent maintenance tasks and OLAP queries. Performance was evaluated using read-heavy and update-heavy workloads. |
Database: | OpenAIRE |
External link: |
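
The query-driven policy summarized in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the `Delta` and `Warehouse` classes, the integer timestamps, and the single key-value table are all hypothetical simplifications, and the sketch assumes that "captured at the time the query arrives" means every delta whose timestamp is not later than the query's arrival time. It omits the pipelining and the workload scheduler entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    """A captured source change (hypothetical shape): set `key` to `value` at time `ts`."""
    ts: int
    key: str
    value: int


@dataclass
class Warehouse:
    """Toy single-table warehouse with a queue of not-yet-applied source deltas."""
    table: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def capture(self, delta: Delta) -> None:
        # Deltas stay at the source side until a query needs them.
        self.pending.append(delta)

    def refresh_until(self, arrival_ts: int) -> None:
        # Apply only the deltas captured no later than the query's arrival time
        # (assumed reading of the query-driven policy).
        due = [d for d in self.pending if d.ts <= arrival_ts]
        self.pending = [d for d in self.pending if d.ts > arrival_ts]
        for d in sorted(due, key=lambda d: d.ts):
            self.table[d.key] = d.value

    def query_sum(self, arrival_ts: int) -> int:
        # Refresh first, then answer from a private copy (the snapshot),
        # so later maintenance cannot change this query's result.
        self.refresh_until(arrival_ts)
        snapshot = dict(self.table)
        return sum(snapshot.values())


wh = Warehouse()
wh.capture(Delta(ts=1, key="a", value=10))
wh.capture(Delta(ts=3, key="b", value=5))
wh.capture(Delta(ts=5, key="a", value=7))

first = wh.query_sum(arrival_ts=3)   # sees the deltas at ts=1 and ts=3 only
second = wh.query_sum(arrival_ts=6)  # additionally sees the delta at ts=5
print(first, second)
```

In this toy run the first query reflects only the two deltas that existed when it arrived, while the second also reflects the later update to `a`. In the approach the description outlines, consecutive delta batches for a sequence of queries would flow through an incremental pipeline rather than being applied one query at a time as here.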