Speeding up RDF aggregate discovery through sampling

Author:	Manolescu, Ioana; Mazuran, Mirjana
Contributors:	Rich Data Analytics at Cloud Scale (CEDAR), Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX), Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria), ∗M. Mazuran is supported by the H2020 research program under grant.
Language:	English
Publication year:	2019
Subject:	[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] [INFO]Computer Science [cs]
Source:	BigVis 2019-2nd International Workshop on Big Data Visual Exploration and Analytics, Mar 2019, Lisbon, Portugal
Description:	International audience; RDF graphs can be large and complex; finding interesting information within them is challenging. One easy way for users to explore such graphs is to be shown interesting aggregates (in the form of two-dimensional graphs, i.e., bar charts), where interestingness is evaluated through statistical criteria. Dagger [5] pioneered this approach; however, it is quite inefficient, in particular due to the need to evaluate numerous, expensive aggregation queries. In this work, we describe Dagger+, which builds upon Dagger and leverages sampling to speed up the evaluation of potentially interesting aggregates. We show that Dagger+ achieves very significant execution time reductions, while reaching results very close to those of the original, less efficient system.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::3e76ecfeaf9afe6f86c250fe5e56922a https://hal.inria.fr/hal-02065993/document