Efficient Row Pattern Matching Using Pattern Hierarchies for Sequence OLAP
Autor: | Yuya Nasu, Hiroyuki Kitagawa, Kosuke Nakabasami |
---|---|
Rok vydání: | 2019 |
Předmět: | |
Zdroj: | Big Data Analytics and Knowledge Discovery ISBN: 9783030275198 DaWaK |
DOI: | 10.1007/978-3-030-27520-4_7 |
Popis: | Sequence OLAP is a variant of OLAP for sequence data analysis such as analysis of RFID log and person trip data. It extracts pattern occurrences of the given patterns (e.g., state transition pattern \(S_1 \rightarrow S_2\), movement pattern \(A \rightarrow B\)) on sequence data and executes multi-dimensional aggregate using OLAP operations (such as drill-down and roll-up) and pattern OLAP operations (such as pattern-drill-down and pattern-roll-up). The pattern OLAP operations are specific to Sequence OLAP and involve a hierarchy of multiple patterns. When sequence data is stored in relational databases as sequences of rows, row pattern matching finds all subsequences of rows which match a given pattern. To do Sequence OLAP, especially pattern OLAP operations, on relational databases, it is required to execute row pattern matching for such a hierarchy of multiple patterns and identify parent-child relationships among pattern occurrences. Generally, row pattern matching needs sequential scan of a large table and is an expensive operation. If row pattern matching is executed individually for each pattern, it is very time consuming. Therefore, it is strongly demanded to execute multiple row pattern matching for a given hierarchy of patterns efficiently. This paper formalizes a pattern hierarchy model for Sequence OLAP and proposes a very efficient algorithm to do multiple row pattern matching using SP-NFA (Shared Prefix Nondeterministic Finite Automaton). In experiments, we implement our algorithm in PostgreSQL and evaluate the effectiveness of the proposal. |
Databáze: | OpenAIRE |
Externí odkaz: |