Improving Marlin's compression ratio with partially overlapping codewords
Authors: | Manuel Martinez, Kai Sandfort, Joan Serra-Sagrista, Danny Dube |
---|---|
Year of publication: | 2021 |
Subject: | |
Source: | Recercat. Dipósit de la Recerca de Catalunya DCC Recercat: Dipósit de la Recerca de Catalunya Varias* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya) |
Description: | Marlin [1] is a Variable-to-Fixed (VF) codec optimized for decoding speed. To achieve its speed, Marlin does not encode the current state of the input source, which penalizes its compression ratio. In this paper we address this penalty by partially encoding the current state of the input in the lower bits of the codeword. Those bits select which chapter of the dictionary must be used to decode the next codeword. Each chapter is specialized for a subset of states, which improves the compression ratio. At the same time, we use one victim chapter to encode all rare symbols, increasing the efficiency of the remaining chapters. The decoding algorithm remains the same, except that codewords now have overlapping bits. Mapping techniques allow us to combine common chapters and thus maintain efficient use of the L1 cache. We evaluate our approach on both synthetic and real data sets, and show significant improvements for low-entropy sources, where compression efficiency can improve from 93.9% to 98.6%. |
Database: | OpenAIRE |
External link: |
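
The chapter-selection idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the codeword width `K`, the overlap width `O`, the `CHAPTERS` table and its contents, and the `decode` function are all hypothetical values and names chosen only to show how the lower bits of one codeword could pick the dictionary chapter used for the next one.

```python
K = 2  # bits stored per codeword (toy value, an assumption)
O = 1  # low bits of a codeword that select the next chapter (toy value)

# Hypothetical dictionary: CHAPTERS[c][w] is the symbol string emitted when
# codeword w is decoded in chapter c. The entries are invented for illustration
# and are not a real Marlin dictionary.
CHAPTERS = [
    ["a", "aa", "ab", "b"],  # chapter 0
    ["b", "ba", "bb", "a"],  # chapter 1
]


def decode(codewords, start_chapter=0):
    """Decode a sequence of K-bit codewords into a symbol string."""
    mask = (1 << O) - 1
    chapter = start_chapter
    out = []
    for w in codewords:
        # Plain table lookup, as in an ordinary variable-to-fixed decoder.
        out.append(CHAPTERS[chapter][w])
        # The low O bits of this codeword choose the chapter for the next one.
        chapter = w & mask
    return "".join(out)


print(decode([1, 2, 3]))
```

In this toy setup each lookup is still a single table access, which is in the spirit of the claim that the decoding algorithm stays the same. The only extra state carried between codewords is the `O`-bit chapter index.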