A Distributed Query Method for RDF Data on Spark

Authors: Jingbin Wang, Minru Guo
Year of publication: 2016
Subject:
Source: Communications in Computer and Information Science ISBN: 9789811004568
Description: With the coming deluge of semantic data, the rapid growth of RDF data poses significant challenges for query processing. A new distributed RDF query algorithm, RQCCP (RDF data Query combined with Classes Correlations with Property), is proposed on the Spark platform to address the low efficiency of RDF data queries. It splits and stores RDF data by the class of the subject, the predicate, and the class of the object, while building an index file of class correlations with properties. The index is used to narrow the input scope of a query by filtering out irrelevant triples in advance, and intermediate query results are cached in memory as resilient distributed datasets (RDDs) to reduce disk and network I/O. Experiments conducted on large-scale RDF datasets show that RQCCP has high query performance.
Database: OpenAIRE
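
The partition-and-index idea summarized in the description can be illustrated with a minimal sketch in plain Python, without Spark. It is not the authors' RQCCP implementation: the function names `build_store` and `candidates`, the sample triples, the single-class-per-resource simplification, and the in-memory dictionaries standing in for stored files are all assumptions made for illustration. The sketch groups triples by (subject class, predicate, object class), records which class pairs occur with each predicate, and uses that index to read only the matching groups for a query pattern.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed marker for class-membership triples


def build_store(triples):
    """Group triples by (subject class, predicate, object class) and
    build an index from each predicate to the class pairs it connects."""
    # Simplification: each resource has at most one class.
    class_of = {s: o for s, p, o in triples if p == RDF_TYPE}
    partitions = defaultdict(list)  # (subj_class, predicate, obj_class) -> triples
    index = defaultdict(set)        # predicate -> {(subj_class, obj_class)}
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue  # type triples are only used to derive classes here
        # Untyped resources and literals get class None.
        key = (class_of.get(s), p, class_of.get(o))
        partitions[key].append((s, p, o))
        index[p].add((key[0], key[2]))
    return partitions, index


def candidates(partitions, index, predicate, subj_class=None, obj_class=None):
    """Return triples for a pattern, reading only the partitions the index
    says can match. A class argument of None acts as a wildcard."""
    out = []
    for sc, oc in sorted(index.get(predicate, ()), key=str):
        if subj_class is not None and sc != subj_class:
            continue
        if obj_class is not None and oc != obj_class:
            continue
        out.extend(partitions[(sc, predicate, oc)])
    return out


# Hypothetical sample data.
triples = [
    ("alice", "rdf:type", "Student"),
    ("bob", "rdf:type", "Professor"),
    ("db101", "rdf:type", "Course"),
    ("alice", "takesCourse", "db101"),
    ("bob", "teacherOf", "db101"),
    ("bob", "name", '"Bob"'),
]
partitions, index = build_store(triples)
result = candidates(partitions, index, "takesCourse", subj_class="Student")
print(result)
```

In a Spark setting, each partition would presumably be a separate stored file loaded as an RDD, and intermediate join results could be kept in memory with `RDD.cache()`; that part is not shown here.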