Incrementally maintaining run-length encoded attributes in column stores

Authors: Abhijeet Mohapatra, Michael R. Genesereth
Publication year: 2012
Subject:
Source: IDEAS
DOI: 10.1145/2351476.2351493
Description: Run-length encoding is a popular compression scheme that is used extensively to compress attribute values in column stores. Out-of-order insertion of tuples can degrade the compression achieved by run-length encoding and, consequently, read performance. In-place insertions, deletions, and updates of tuples in a column-store relation with n tuples take O(n) time. This linear cost is typically avoided by amortizing updates over batches. However, the relation is decompressed and subsequently re-compressed after a batch of updates is applied, which adds to the time complexity. We propose a novel indexing scheme called count indexes that supports O(log n) in-place insertions, deletions, updates, and lookups on a run-length encoded sequence with n runs. We also show that count indexes efficiently update a batch of tuples, requiring almost constant time per updated tuple. Additionally, we show that count indexes are optimal. We extend count indexes to support O(log n) updates on bitmapped sequences with n values and adapt them to block-based stores.
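To make the cost described above concrete, here is a minimal Python sketch of the linear-time baseline: an in-place insertion into a run-length encoded sequence that finds the target run by scanning. It is not the authors' count index; the function names (`rle_encode`, `rle_decode`, `rle_insert`) and the `[value, count]` run layout are assumptions made only for illustration. A count index, as the description presents it, would replace the linear scan with an O(log n) lookup.

```python
from itertools import groupby


def rle_encode(values):
    """Compress a sequence into a list of [value, count] runs."""
    return [[v, sum(1 for _ in g)] for v, g in groupby(values)]


def rle_decode(runs):
    """Expand [value, count] runs back into a flat list."""
    return [v for v, c in runs for _ in range(c)]


def rle_insert(runs, pos, value):
    """Insert value at 0-based position pos by scanning the runs linearly.

    Hypothetical baseline, linear in the number of runs.
    """
    if pos < 0:
        raise IndexError("negative position")
    if pos == 0 and not runs:
        runs.append([value, 1])
        return
    offset = 0  # number of values covered by the runs before run i
    for i, (v, c) in enumerate(runs):
        if pos <= offset + c:
            if v == value:
                # Same value as this run: just lengthen it.
                runs[i][1] += 1
            elif pos == offset + c:
                # Boundary after run i: extend the next run or add a new one.
                if i + 1 < len(runs) and runs[i + 1][0] == value:
                    runs[i + 1][1] += 1
                else:
                    runs.insert(i + 1, [value, 1])
            elif pos == offset:
                # Only reachable at the very front of the sequence.
                runs.insert(i, [value, 1])
            else:
                # Out-of-order insert: one run splits into three.
                k = pos - offset
                runs[i:i + 1] = [[v, k], [value, 1], [v, c - k]]
            return
        offset += c
    raise IndexError("position past end of sequence")


runs = rle_encode("aaabbbcc")
rle_insert(runs, 1, "b")
print(runs)  # [['a', 1], ['b', 1], ['a', 2], ['b', 3], ['c', 2]]
```

The single out-of-order insert turns three runs into five, which illustrates how such insertions can degrade compression.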
Database: OpenAIRE