Document Based RDF Storage Method for Efficient Parallel Query Processing
Author: | Eleftherios Kalogeros, Matthew Damigos, Manolis Gergatsoulis |
---|---|
Year of publication: | 2019 |
Subject: |
Theoretical computer science; Computer science; Software engineering; Engineering and technology; Linked data; Star (graph theory); Query language; NoSQL; Replication (computing); Data model; Information systems; Electrical engineering, electronic engineering, information engineering; SPARQL; RDF |
Source: | Metadata and Semantic Research (MTSR), ISBN: 9783030144005 |
DOI: | 10.1007/978-3-030-14401-2_2 |
Description: | In this paper, we investigate the problem of efficiently evaluating SPARQL queries over large amounts of linked data using a distributed NoSQL system. We propose an efficient approach for partitioning large linked data graphs using a distributed framework (MapReduce), as well as an effective data model for storing linked data in a document database with a maximum replication factor of 2 (i.e., in the worst case, the stored data graph doubles in size). The proposed model and partitioning approach ensure high-performance query evaluation and horizontal scaling for a class of queries called generalized star queries (i.e., queries allowing both subject-object and object-subject edges from a central node), because no join operations over multiple datasets are required to evaluate them. Furthermore, we present an implementation of our approach using MongoDB and an algorithm for translating generalized star queries into the MongoDB query language, based on the proposed data model. |
Database: | OpenAIRE |
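The abstract describes the data model and the query translation only in outline, so the following is a minimal sketch of the general idea rather than the authors' actual schema or algorithm. It assumes one document per node, with hypothetical field names `out` and `in` for subject-object and object-subject edges, and a hypothetical convention that literals start with a double quote and get no document of their own. Under these assumptions each triple is stored at most twice, and a generalized star query becomes a single-document filter written with the MongoDB operators `$and` and `$elemMatch`. The helper `matches` is a small in-memory stand-in for a MongoDB server so that the sketch runs without a database.

```python
from collections import defaultdict


def is_literal(term):
    # Assumption for this sketch: literals are written with a leading quote.
    return term.startswith('"')


def build_documents(triples):
    """Group (s, p, o) triples into one document per non-literal node.

    Every triple is recorded under "out" in the subject's document and,
    unless the object is a literal, under "in" in the object's document,
    so no triple is stored more than twice.
    """
    docs = defaultdict(lambda: {"out": [], "in": []})
    for s, p, o in triples:
        docs[s]["out"].append({"p": p, "o": o})
        if not is_literal(o):
            docs[o]["in"].append({"p": p, "s": s})
    return [{"_id": node, **edges} for node, edges in docs.items()]


def star_query_to_filter(edges):
    """Translate a generalized star query into a MongoDB-style filter.

    `edges` lists (direction, predicate, other) around the central node:
    direction is "out" or "in"; other is a constant, or None for a variable.
    """
    clauses = []
    for direction, predicate, other in edges:
        match = {"p": predicate}
        if other is not None:
            match["o" if direction == "out" else "s"] = other
        clauses.append({direction: {"$elemMatch": match}})
    # MongoDB rejects an empty $and array, so fall back to the empty filter.
    return {"$and": clauses} if clauses else {}


def matches(doc, query_filter):
    """In-memory check covering only the filter shape produced above."""
    for clause in query_filter.get("$and", []):
        field, cond = next(iter(clause.items()))
        wanted = cond["$elemMatch"]
        if not any(all(e.get(k) == v for k, v in wanted.items())
                   for e in doc[field]):
            return False
    return True


# Illustrative data: find ?x such that ?x knows bob and someone knows ?x.
triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "alice"),
    ("alice", "name", '"Alice"'),
    ("bob", "knows", "carol"),
]
docs = build_documents(triples)
query = star_query_to_filter([("out", "knows", "bob"), ("in", "knows", None)])
print([d["_id"] for d in docs if matches(d, query)])  # prints ['alice']
```

Because every edge that touches a node lives in that node's document, the filter is evaluated against one document at a time, which is the property the abstract credits for avoiding joins across datasets. With a running MongoDB instance, the same filter dictionary could be passed to a collection's `find` call instead of `matches`.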