Showing 1 - 10 of 78 for search: '"Nishimoto, Takaaki"'
Author:
Nishimoto, Takaaki, Tabei, Yasuo
Big data, encompassing extensive datasets, has seen rapid expansion, notably with a considerable portion being textual data, including strings and texts. Simple compression methods and standard data structures prove inadequate for processing these da
External link:
http://arxiv.org/abs/2404.07510
Author:
Bannai, Hideo, Goto, Keisuke, Ishihata, Masakazu, Kanda, Shunsuke, Köppl, Dominik, Nishimoto, Takaaki
Repetitiveness measures reveal profound characteristics of datasets, and give rise to compressed data structures and algorithms working in compressed space. Alas, the computation of some of these measures is NP-hard, and straightforward computation
External link:
http://arxiv.org/abs/2207.02571
The compression of highly repetitive strings (i.e., strings with many repetitions) has been a central research topic in string processing, and quite a few compression methods for these strings have been proposed thus far. Among them, an efficient com
External link:
http://arxiv.org/abs/2202.07885
Author:
Bannai, Hideo, Funakoshi, Mitsuru, I, Tomohiro, Koeppl, Dominik, Mieno, Takuya, Nishimoto, Takaaki
We prove that for $n\geq 2$, the size $b(t_n)$ of the smallest bidirectional scheme for the $n$th Thue--Morse word $t_n$ is $n+2$. Since Kutsukake et al. [SPIRE 2020] show that the size $\gamma(t_n)$ of the smallest string attractor for $t_n$ is $4$
External link:
http://arxiv.org/abs/2104.09985
Author:
Nishimoto, Takaaki, Tabei, Yasuo
Indexing highly repetitive strings (i.e., strings with many repetitions) for fast queries has become a central research topic in string processing, because it has a wide variety of applications in bioinformatics and natural language processing. Altho
External link:
http://arxiv.org/abs/2006.05104
Author:
Nishimoto, Takaaki, Tabei, Yasuo
Enumerating characteristic substrings (e.g., maximal repeats, minimal unique substrings, and minimal absent words) in a given string has been an important research topic because there are a wide variety of applications in various areas such as string
External link:
http://arxiv.org/abs/2004.01493
Author:
Nishimoto, Takaaki, Tabei, Yasuo
Converting a compressed format of a string into another compressed format without an explicit decompression is one of the central research topics in string processing. We discuss the problem of converting the run-length Burrows-Wheeler Transform (RLB
External link:
http://arxiv.org/abs/1902.05224
Author:
Nishimoto, Takaaki, Tabei, Yasuo
Lossless data compression has been widely studied in computer science. One of the most widely used lossless data compression methods is Lempel-Ziv (LZ) 77 parsing, which achieves a high compression ratio. Bidirectional (a.k.a. macro) parsing is a lossless da
External link:
http://arxiv.org/abs/1812.04261
Author:
Nishimoto, Takaaki, Tabei, Yasuo
Published in:
Information and Computation, May 2022, 285, Part B
We present a novel compressed dynamic self-index for highly repetitive text collections. Signature encoding is a compressed dynamic self-index for highly repetitive texts and has a large disadvantage that the pattern search for short patterns is slow
External link:
http://arxiv.org/abs/1711.02855