Speeding up RDF aggregate discovery through sampling

Authors: Manolescu, Ioana; Mazuran, Mirjana
Contributors: Rich Data Analytics at Cloud Scale (CEDAR); Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX); Centre National de la Recherche Scientifique (CNRS); École polytechnique (X); Inria Saclay - Ile de France; Institut National de Recherche en Informatique et en Automatique (Inria). Note: M. Mazuran is supported by the H2020 research program under grant.
Language: English
Year of publication: 2019
Subject:
Source: BigVis 2019 - 2nd International Workshop on Big Data Visual Exploration and Analytics, Mar 2019, Lisbon, Portugal
Description: RDF graphs can be large and complex; finding interesting information within them is challenging. One easy way for users to explore such graphs is to be shown interesting aggregates (in the form of two-dimensional graphs, i.e., bar charts), where interestingness is evaluated through statistical criteria. Dagger [5] pioneered this approach; however, it is quite inefficient, in particular due to the need to evaluate numerous, expensive aggregation queries. In this work, we describe Dagger+, which builds upon Dagger and leverages sampling to speed up the evaluation of potentially interesting aggregates. We show that Dagger+ achieves very significant execution time reductions, while reaching results very close to those of the original, less efficient system.
Database: OpenAIRE