Dynamic Virtual Chunks: On Supporting Efficient Accesses to Compressed Scientific Data
Author: | Dongfang Zhao, Jian Yin, Ioan Raicu, Kan Qiao |
---|---|
Year of publication: | 2016 |
Subject: |
020203 distributed computing; Information Systems and Management; Distributed database; Computer Networks and Communications; Computer science; Data_CODINGANDINFORMATIONTHEORY; 02 engineering and technology; Parallel computing; Electronic mail; Computer Science Applications; Hardware and Architecture; Compression (functional analysis); Compression ratio; 0202 electrical engineering, electronic engineering, information engineering; Overhead (computing); Data-intensive computing; 020201 artificial intelligence & image processing; Block (data storage); Data compression |
Source: | IEEE Transactions on Services Computing. 9:96-109 |
ISSN: | 2372-0204 |
DOI: | 10.1109/tsc.2015.2456889 |
Description: | Data compression could ameliorate the I/O pressure of data-intensive scientific applications. Unfortunately, the conventional approach of naively applying data compression to the whole file or to each block creates a dilemma between efficient random accesses and high compression ratios. File-level compression barely supports efficient random accesses to the compressed data: any retrieval request must trigger decompression from the beginning of the compressed file. Block-level compression provides flexible random accesses to the compressed blocks, but applying the compressor to each and every block introduces extra overhead, which degrades the overall compression ratio. This paper extends our prior work, which introduced virtual chunks that offer efficient random accesses to compressed scientific data without sacrificing the compression ratio. Virtual chunks are logical blocks pointed at by appended references, without breaking the physical continuity of the file content. These references allow decompression to start from an arbitrary position (efficient random accesses), while no per-block overhead is introduced because the file's physical entirety is retained (high compression ratio). One limitation of virtual chunks is that they only support static references. This paper presents the algorithms, analysis, and evaluations of dynamic virtual chunks to deal with cases where the references are updated dynamically. |
Database: | OpenAIRE |
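
The idea in the description, references appended to a single compressed stream so that decompression can start near the requested position, can be illustrated with a minimal sketch. It assumes a simple delta encoder over a non-empty list of integers and a fixed chunk size; the function names `compress` and `read_range` are hypothetical and this is not the algorithm from the paper, only an illustration of the general principle.

```python
from bisect import bisect_right


def compress(values, chunk_size):
    """Delta-encode a non-empty integer list as ONE continuous stream.

    Returns the delta stream plus a list of appended references.
    Each reference stores (index, original value) at the start of a
    logical ("virtual") chunk; the delta stream itself is never split.
    """
    deltas = [values[0]] + [values[i] - values[i - 1] for i in range(1, len(values))]
    refs = [(i, values[i]) for i in range(0, len(values), chunk_size)]
    return deltas, refs


def read_range(deltas, refs, start, stop):
    """Decode values[start:stop] from the nearest preceding reference.

    Only the deltas between that reference and `stop` are touched,
    instead of decoding from the beginning of the stream.
    """
    ref_positions = [i for i, _ in refs]
    ref_index, current = refs[bisect_right(ref_positions, start) - 1]
    out = []
    for i in range(ref_index, stop):
        if i > ref_index:
            current += deltas[i]  # accumulate deltas past the reference
        if i >= start:
            out.append(current)
    return out


values = [10, 12, 15, 15, 20, 26, 27, 30]
deltas, refs = compress(values, chunk_size=3)
window = read_range(deltas, refs, 4, 7)
```

In this sketch a smaller `chunk_size` means more references (more space) but shorter decoding runs per request; the stream of deltas stays physically contiguous either way.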