Streaming partitioning of RDF graphs for datalog reasoning

Authors: Boris Motik, Temitope Ajileye, Ian Horrocks
Language: English
Year of publication: 2021
Subject:
Source: The Semantic Web ISBN: 9783030773847
ESWC
DOI: 10.1007/978-3-030-77385-4_1
Description: A cluster of servers is often used to reason over RDF graphs whose size exceeds the capacity of a single server. While many distributed approaches to reasoning have been proposed, the problem of data partitioning has received little attention thus far. In practice, data is usually partitioned by a variant of hashing, which is very simple but ignores data locality. Locality-aware partitioning approaches have been considered, but they usually process the entire dataset on a single server. In this paper, we present two new RDF partitioning strategies. Both are inspired by recent streaming graph partitioning algorithms, which partition a graph while keeping only a small subset of the graph in memory. We have evaluated our approaches empirically against hash and min-cut partitioning. Our results suggest that our approaches can significantly improve reasoning performance without placing unrealistic demands on the memory of the servers used for partitioning.
Database: OpenAIRE
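
Illustration: the contrast drawn in the description between hash partitioning and locality-aware streaming partitioning can be sketched roughly as follows. This is not the authors' method; the record does not describe their two strategies. The function names, the per-server `capacity` parameter, and the greedy rule (place a triple with an already-placed subject or object if that server has room, otherwise on the least-loaded server) are all assumptions made for illustration only.

```python
import zlib


def hash_partition(triples, k):
    """Baseline: send each triple to the server chosen by a hash of its subject.

    Simple and stateless, but it ignores which resources are connected.
    """
    return [zlib.crc32(s.encode("utf-8")) % k for s, _p, _o in triples]


def streaming_greedy_partition(triples, k, capacity):
    """Hypothetical one-pass greedy heuristic (an assumption, not the paper's algorithm).

    Reads triples once and keeps only per-resource and per-server state,
    never the graph itself. `capacity` is a soft limit: if every server
    is full, the least-loaded one is used anyway.
    """
    home = {}          # resource -> server it was first placed on
    load = [0] * k     # number of triples assigned to each server
    assignment = []
    for s, _p, o in triples:
        # Servers that already hold the subject or object and still have room.
        seen = {home[r] for r in (s, o) if r in home}
        candidates = [i for i in seen if load[i] < capacity]
        pool = candidates or range(k)
        # Prefer the least-loaded server; break ties by lower index.
        part = min(pool, key=lambda i: (load[i], i))
        home.setdefault(s, part)
        home.setdefault(o, part)
        load[part] += 1
        assignment.append(part)
    return assignment, home


def cut_triples(triples, home):
    """Count triples whose subject and object live on different servers."""
    return sum(1 for s, _p, o in triples if home[s] != home[o])


# Two disconnected triangles: a locality-aware scheme can keep each on one server.
SAMPLE = [
    ("a", "knows", "b"), ("b", "knows", "c"), ("c", "knows", "a"),
    ("x", "knows", "y"), ("y", "knows", "z"), ("z", "knows", "x"),
]
```

On `SAMPLE` with `k=2` and `capacity=3`, the greedy sketch keeps each triangle on its own server, so `cut_triples` reports no cross-server triples; hash partitioning gives no such guarantee, because it places subjects independently of their neighbours.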