Zobrazeno 1 - 10
of 71
pro vyhledávání: '"Kempa, Dominik"'
Autor:
Kempa, Dominik, Kociumaka, Tomasz
Lempel-Ziv (LZ77) factorization is a fundamental problem in string processing: Greedily partition a given string $T$ from left to right into blocks (called phrases) so that each phrase is either the leftmost occurrence of a letter or the longest pref
Externí odkaz:
http://arxiv.org/abs/2409.12146
Autor:
Kempa, Dominik, Kociumaka, Tomasz
In the last decades, the necessity to process massive amounts of textual data fueled the development of compressed text indexes: data structures efficiently answering queries on a given text while occupying space proportional to the compressed repres
Externí odkaz:
http://arxiv.org/abs/2308.03635
Autor:
De, Rajat, Kempa, Dominik
Grammar compression is a general compression framework in which a string $T$ of length $N$ is represented as a context-free grammar of size $n$ whose language contains only $T$. In this paper, we focus on studying the limitations of algorithms and da
Externí odkaz:
http://arxiv.org/abs/2307.08833
Autor:
Kempa, Dominik, Kociumaka, Tomasz
The suffix array $SA[1..n]$ of a text $T$ of length $n$ is a permutation of $\{1,\ldots,n\}$ describing the lexicographical ordering of suffixes of $T$, and it is considered to be among of the most important data structures in string algorithms, with
Externí odkaz:
http://arxiv.org/abs/2201.01285
Autor:
Kempa, Dominik, Kociumaka, Tomasz
The suffix array and the suffix tree are the two most fundamental data structures for string processing. For a length-$n$ text, however, they use $\Theta(n \log n)$ bits of space, which is often too costly. To address this, Grossi and Vitter [STOC 20
Externí odkaz:
http://arxiv.org/abs/2106.12725
Autor:
Kempa, Dominik, Langmead, Ben
Grammar compression is, next to Lempel-Ziv (LZ77) and run-length Burrows-Wheeler transform (RLBWT), one of the most flexible approaches to representing and processing highly compressible strings. The main idea is to represent a text as a context-free
Externí odkaz:
http://arxiv.org/abs/2105.11052
Autor:
Kempa, Dominik, Kociumaka, Tomasz
The Burrows-Wheeler Transform (BWT) is an invertible text transformation that permutes symbols of a text according to the lexicographical order of its suffixes. BWT is the main component of popular lossless compression programs (such as bzip2) as wel
Externí odkaz:
http://arxiv.org/abs/1910.10631
Autor:
Kempa, Dominik, Kociumaka, Tomasz
Burrows-Wheeler transform (BWT) is an invertible text transformation that, given a text $T$ of length $n$, permutes its symbols according to the lexicographic order of suffixes of $T$. BWT is one of the most heavily studied algorithms in data compres
Externí odkaz:
http://arxiv.org/abs/1904.04228
String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set $\Gamma\subseteq [1..n]$ is a $k$-attractor for a string $S\in[1..\sigma]^n$ if and only if eve
Externí odkaz:
http://arxiv.org/abs/1803.01695
Autor:
Kempa, Dominik1 kempa@cs.jhu.edu, Kociumaka, Tomasz2 kociumaka@berkeley.edu
Publikováno v:
Communications of the ACM. Jun2022, Vol. 65 Issue 6, p91-98. 8p. 2 Diagrams, 1 Chart.