Zobrazeno 1 - 10
of 1 830
pro vyhledávání: '"Pissis, A."'
Time series are ubiquitous in domains ranging from medicine to marketing and finance. Frequent Pattern Mining (FPM) from a time series has thus received much attention. Recently, it has been studied under the order-preserving (OP) matching relation s
Externí odkaz:
http://arxiv.org/abs/2411.19511
Autor:
Gabory, Esteban, Mwaniki, Moses Njagi, Pisanti, Nadia, Pissis, Solon P., Radoszewski, Jakub, Sweering, Michelle, Zuba, Wiktor
An elastic-degenerate (ED) string $T$ is a sequence of $n$ sets $T[1],\ldots,T[n]$ containing $m$ strings in total whose cumulative length is $N$. We call $n$, $m$, and $N$ the length, the cardinality and the size of $T$, respectively. The language o
Externí odkaz:
http://arxiv.org/abs/2411.07782
Autor:
Pissis, Solon P.
We revisit the classic border tree data structure [Gu, Farach, Beigel, SODA 1994] that answers the following prefix-suffix queries on a string $T$ of length $n$ over an integer alphabet $\Sigma=[0,\sigma)$: for any $i,j \in [0,n)$ return all occurren
Externí odkaz:
http://arxiv.org/abs/2411.03784
In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to represent such
Externí odkaz:
http://arxiv.org/abs/2407.11819
Minimizers sampling is one of the most widely-used mechanisms for sampling strings [Roberts et al., Bioinformatics 2004]. Let $S=S[1]\ldots S[n]$ be a string over a totally ordered alphabet $\Sigma$. Further let $w\geq 2$ and $k\geq 1$ be two integer
Externí odkaz:
http://arxiv.org/abs/2405.04052
Strings in the real world are often encoded with some level of uncertainty. In the character-level uncertainty model, an uncertain string $X$ of length $n$ on an alphabet $\Sigma$ is a sequence of $n$ probability distributions over $\Sigma$. Given an
Externí odkaz:
http://arxiv.org/abs/2403.14256
Autor:
Charalampopoulos, Panagiotis, Pissis, Solon P., Radoszewski, Jakub, Rytter, Wojciech, Waleń, Tomasz, Zuba, Wiktor
In the $k$-Edit Circular Pattern Matching ($k$-Edit CPM) problem, we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a positive integer threshold $k$, and we are to report all starting positions of the substrings of $T$ that are at edi
Externí odkaz:
http://arxiv.org/abs/2402.14550
The weighted ancestor problem on a rooted node-weighted tree $T$ is a generalization of the classic predecessor problem: construct a data structure for a set of integers that supports fast predecessor queries. Both problems are known to require $\Ome
Externí odkaz:
http://arxiv.org/abs/2311.15777
Sparse suffix sorting is the problem of sorting $b=o(n)$ suffixes of a string of length $n$. Efficient sparse suffix sorting algorithms have existed for more than a decade. Despite the multitude of works and their justified claims for applications in
Externí odkaz:
http://arxiv.org/abs/2310.09023
Autor:
Bille, Philip, Gørtz, Inge Li, Lewenstein, Moshe, Pissis, Solon P., Rotenberg, Eva, Steiner, Teresa Anna
In Gapped String Indexing, the goal is to compactly represent a string $S$ of length $n$ such that for any query consisting of two strings $P_1$ and $P_2$, called patterns, and an integer interval $[\alpha, \beta]$, called gap range, we can quickly f
Externí odkaz:
http://arxiv.org/abs/2211.16860