Representing Paths in Graph Database Pattern Matching

Authors: Wim Martens, Matthias Niewerth, Tina Popp, Carlos Rojas, Stijn Vansummeren, Domagoj Vrgoč
Contributors: Martens, Wim; Niewerth, Matthias; Popp, Tina; Rojas, Carlos; Vansummeren, Stijn; Vrgoč, Domagoj
Language: English
Year of publication: 2023
Subject:
Description: Modern graph database query languages such as GQL, SQL/PGQ, and their academic predecessor G-Core promote paths to first-class citizens in the sense that their pattern matching facility can return paths, as opposed to only nodes and edges. This is challenging for database engines, since graphs can have a large number of paths between a given node pair, which can cause huge intermediate results in query evaluation. We introduce the concept of path multiset representations (PMRs), which can represent multisets of paths exponentially succinctly and therefore bring significant advantages for representing intermediate results. We give a detailed theoretical analysis that shows that they are especially well-suited for representing results of regular path queries and extensions thereof involving counting, random sampling, and unions. Our experiments show that they drastically improve scalability for regular path query evaluation, with speedups of several orders of magnitude. We are grateful to Matthias Hofer for valuable discussions and to Wojciech Czerwiński for pointing us to [58]. This work was supported by the ANR project EQUUS (ANR-19-CE48-0019) and funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 431183758. Vansummeren was supported by the Bijzonder Onderzoeksfonds (BOF) of Hasselt University (Belgium) under Grant No. BOF20ZAP02. Vrgoč and Rojas were supported by ANID – Millennium Science Initiative Program – Code ICN17_002. Vrgoč was also supported by ANID Fondecyt Regular grant no. 1221799.
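Illustrative note: the description's point that a graph can have very many paths between one node pair can be made concrete with a small sketch. The code below is not the paper's PMR construction; it only assumes a hypothetical "diamond chain" graph (the names `diamond_chain` and `count_paths` are invented for this illustration) to show how a graph with a linear number of edges can encode exponentially many paths, and how those paths can be counted without enumerating them.

```python
from collections import defaultdict


def diamond_chain(n):
    """Build a hypothetical DAG of n consecutive 'diamonds'.

    Each stage i offers two parallel two-edge routes from v{i} to v{i+1},
    so the graph has 4*n edges but 2**n distinct paths from v0 to v{n}.
    Returns (adjacency lists, source node, target node).
    """
    edges = defaultdict(list)
    for i in range(n):
        start, end = f"v{i}", f"v{i + 1}"
        for branch in ("a", "b"):
            mid = f"m{i}{branch}"
            edges[start].append(mid)
            edges[mid].append(end)
    return edges, "v0", f"v{n}"


def count_paths(edges, source, target):
    """Count source-to-target paths in a DAG with memoised depth-first search.

    Each node is expanded once, so the cost is linear in the size of the
    graph even when the number of paths is exponential.
    """
    memo = {}

    def paths_from(node):
        if node == target:
            return 1
        if node not in memo:
            memo[node] = sum(paths_from(nxt) for nxt in edges.get(node, ()))
        return memo[node]

    return paths_from(source)


if __name__ == "__main__":
    graph, src, dst = diamond_chain(20)
    num_edges = sum(len(targets) for targets in graph.values())
    print(num_edges, count_paths(graph, src, dst))
```

With 20 stages the graph has 80 edges, yet materialising every path as an explicit list would require 2**20 entries. Keeping the graph itself as the representation is the kind of succinctness the description attributes to PMRs, under the assumptions stated above.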
Database: OpenAIRE