Document Based RDF Storage Method for Efficient Parallel Query Processing

Authors: Eleftherios Kalogeros, Matthew Damigos, Manolis Gergatsoulis
Publication year: 2019
Subject:
Source: Metadata and Semantic Research ISBN: 9783030144005
MTSR
DOI: 10.1007/978-3-030-14401-2_2
Description: In this paper, we investigate the problem of efficiently evaluating SPARQL queries over large amounts of linked data using distributed NoSQL systems. We propose an efficient approach for partitioning large linked data graphs using distributed frameworks (MapReduce), as well as an effective data model for storing linked data in a document database with a maximum replication factor of 2 (i.e., in the worst case, the data graph doubles in storage size). The proposed model and partitioning approach ensure high-performance query evaluation and horizontal scaling for the class of queries called generalized star queries (i.e., queries allowing both subject-object and object-subject edges from a central node), because no join operations over multiple datasets are required to evaluate them. Furthermore, we present an implementation of our approach using MongoDB and an algorithm, based on the proposed data model, for translating generalized star queries into the MongoDB query language.
Database: OpenAIRE
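
Illustration: the description does not give the paper's actual document schema or translation algorithm, so the following is only a minimal Python sketch of the general idea, under stated assumptions. It assumes a hypothetical layout with one document per node, holding its outgoing edges in an `out` array and its incoming edges in an `in` array, so each triple is stored at most twice (consistent with the stated replication factor of 2). The function names `build_documents` and `star_query_to_filter`, the field names `out`, `in`, `p`, `o`, `s`, and the `ex:` identifiers are all invented for this example. Only `$and` and `$elemMatch` are real MongoDB query operators.

```python
from collections import defaultdict


def build_documents(triples):
    """Group triples into one document per node (hypothetical layout).

    Each triple is (subject, predicate, object, object_is_literal).
    It is stored in the subject's document as an outgoing edge and,
    when the object is a resource, also in the object's document as
    an incoming edge. Every triple is therefore stored at most twice.
    """
    docs = defaultdict(lambda: {"out": [], "in": []})
    for s, p, o, o_is_literal in triples:
        docs[s]["out"].append({"p": p, "o": o})
        if not o_is_literal:
            docs[o]["in"].append({"p": p, "s": s})
    return [{"_id": node, **edges} for node, edges in docs.items()]


def star_query_to_filter(patterns):
    """Translate a star query around one central node into a filter dict.

    Each pattern is (direction, predicate, other), where direction is
    "out" (central node is the subject) or "in" (central node is the
    object), and other is a constant or None for a variable.
    """
    clauses = []
    for direction, predicate, other in patterns:
        match = {"p": predicate}
        if other is not None:
            # Constrain the far end of the edge only when it is a constant.
            match["o" if direction == "out" else "s"] = other
        clauses.append({direction: {"$elemMatch": match}})
    # MongoDB rejects an empty $and array, so fall back to match-all.
    return {"$and": clauses} if clauses else {}
```

Under these assumptions, every pattern of a generalized star query is checked against a single document, so the filter could be passed unchanged to a document-store lookup such as PyMongo's `collection.find(...)` without joining across collections. Retrieving variable bindings from the matched documents is left out of this sketch.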