Dynamic Virtual Chunks: On Supporting Efficient Accesses to Compressed Scientific Data

Authors: Dongfang Zhao, Jian Yin, Ioan Raicu, Kan Qiao
Publication year: 2016
Subject:
Source: IEEE Transactions on Services Computing. 9:96-109
ISSN: 2372-0204
DOI: 10.1109/tsc.2015.2456889
Description: Data compression could ameliorate the I/O pressure of data-intensive scientific applications. Unfortunately, the conventional approach of naively applying data compression to the whole file or to each block creates a dilemma between efficient random access and a high compression ratio. File-level compression barely supports efficient random access to the compressed data: any retrieval request must trigger decompression from the beginning of the compressed file. Block-level compression provides flexible random access to the compressed blocks, but applying the compressor to each and every block introduces extra overhead that degrades the overall compression ratio. This paper extends our prior work, which introduced virtual chunks that offer efficient random access to compressed scientific data without sacrificing the compression ratio. Virtual chunks are logical blocks pointed at by appended references, without breaking the physical continuity of the file content. These references allow decompression to start from an arbitrary position (efficient random access), while no per-block overhead is introduced because the file's physical entirety is retained (high compression ratio). One limitation of virtual chunks is that they support only static references. This paper presents the algorithms, analysis, and evaluations of dynamic virtual chunks, which handle cases where the references are updated dynamically.
Database: OpenAIRE
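
The idea of appended references described in the abstract can be illustrated with a minimal sketch. It assumes a simple delta encoder over integers as the compressor, which is a stand-in chosen for illustration and not necessarily the compressor used in the paper; the names encode, read_range, and chunk_size are hypothetical. The encoded stream stays one continuous sequence, and a separate list of references lets decoding begin at the start of any virtual chunk. Static references only are shown; the dynamic updates the paper addresses are not modeled.

```python
from itertools import accumulate  # `initial=` requires Python 3.8+


def encode(values, chunk_size):
    """Delta-encode `values` as one continuous stream and build references.

    Hypothetical helper: the stream is never split into physical blocks;
    the references are kept separately, as if appended to the file.
    """
    if not values:
        return [], []
    deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    # Reference j stores the original value at index j * chunk_size,
    # so decoding can start there without reading earlier deltas.
    refs = [values[i] for i in range(0, len(values), chunk_size)]
    return deltas, refs


def read_range(deltas, refs, chunk_size, start, stop):
    """Return the original values[start:stop], decoding from the nearest
    reference at or before `start` instead of from the file's beginning."""
    chunk = start // chunk_size
    base = chunk * chunk_size
    # Rebuild values for indices base .. stop-1 from the reference value.
    decoded = list(accumulate(deltas[base + 1:stop], initial=refs[chunk]))
    return decoded[start - base:stop - base]


values = [10, 12, 15, 15, 20, 26, 27, 30]
deltas, refs = encode(values, chunk_size=3)
print(read_range(deltas, refs, 3, 4, 7))
```

In this sketch a smaller chunk_size means more references and less decoding work per read, at the cost of a longer reference list; the delta stream itself is identical for every chunk_size, which mirrors the claim that no per-block overhead is added to the compressed content.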