A Distributed Query Method for RDF Data on Spark
Author: | Jingbin Wang, Minru Guo |
---|---|
Publication year: | 2016 |
Subject: | Indexed file; Information retrieval; Database; Computer science; RDF Schema; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; InformationSystems_DATABASEMANAGEMENT; 02 engineering and technology; computer.file_format; Semantic data model; computer.software_genre; Predicate (grammar); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; SPARQL; 020201 artificial intelligence & image processing; Cache; RDF; computer; RDF query language; computer.programming_language |
Source: | Communications in Computer and Information Science; ISBN: 9789811004568 |
Description: | The rapid growth of semantic data makes querying large RDF datasets increasingly challenging. To address the low efficiency of RDF queries, a new distributed RDF query algorithm on the Spark platform, RQCCP (RDF data Query combined with Classes Correlations with Property), is proposed. It splits and stores RDF data by the class of the subject, the predicate, and the class of the object, and at the same time builds an index file of class correlations with properties. The index narrows the query input by filtering out irrelevant triples in advance, and intermediate query results are cached in memory as resilient distributed datasets (RDDs) to reduce disk and network I/O. Experiments on large-scale RDF datasets show that RQCCP achieves high query performance. |
Database: | OpenAIRE |
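
The partition-and-index idea in the description can be illustrated with a small sketch. This is not the authors' RQCCP implementation: it is plain in-memory Python rather than Spark, and the function names (`build_partitions`, `candidates`), the sample triples, the `"Unknown"`/`"Literal"` placeholder classes, and the assumption that each resource has exactly one `rdf:type` are all hypothetical choices made for illustration. On Spark, the per-key lists would presumably be stored as separate files or RDDs, and intermediate results could be kept in memory with `RDD.cache()`.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def build_partitions(triples):
    """Group triples by (subject class, predicate, object class).

    Also returns an index: predicate -> set of (subject class, object class).
    Assumption: each resource has at most one rdf:type; untyped objects are
    treated as literals.
    """
    class_of = {s: o for s, p, o in triples if p == RDF_TYPE}
    partitions = defaultdict(list)  # stands in for one stored file per key
    index = defaultdict(set)        # class correlations for each property
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue  # type triples are only used to derive classes
        key = (class_of.get(s, "Unknown"), p, class_of.get(o, "Literal"))
        partitions[key].append((s, p, o))
        index[p].add((key[0], key[2]))
    return partitions, index


def candidates(partitions, index, predicate, subject_class=None, object_class=None):
    """Read only the partitions that the index says can match the pattern."""
    matched = []
    for sc, oc in sorted(index.get(predicate, ())):
        if subject_class not in (None, sc):
            continue
        if object_class not in (None, oc):
            continue
        matched.extend(partitions[(sc, predicate, oc)])
    return matched


# Hypothetical sample data.
triples = [
    ("alice", RDF_TYPE, "Student"),
    ("bob", RDF_TYPE, "Professor"),
    ("db101", RDF_TYPE, "Course"),
    ("alice", "takes", "db101"),
    ("bob", "teaches", "db101"),
    ("bob", "name", "Bob"),
]

partitions, index = build_partitions(triples)

# Query: which students take a course taught by which professor?
# Each pattern touches only its own partition; the 'name' triples are never read.
takes = candidates(partitions, index, "takes", subject_class="Student")
teaches = candidates(partitions, index, "teaches", subject_class="Professor")

# Join the two intermediate results on the shared course variable.
teacher_of = {course: prof for prof, _, course in teaches}
pairs = [(student, teacher_of[course]) for student, _, course in takes if course in teacher_of]
```

In this sketch the index plays the filtering role the description attributes to it: a triple pattern with a known predicate and class constraints selects a few partition keys up front, so unrelated triples are excluded before any join is evaluated.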