Online Relation Alignment for Linked Datasets

Authors:	Koutraki, Maria; Preda, Nicoleta; Vodislav, Dan
Contributors:	Parallélisme, Réseaux, Systèmes, Modélisation (PRISM), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Centre National de la Recherche Scientifique (CNRS), Karlsruhe Institute of Technology (KIT), Equipes Traitement de l'Information et Systèmes (ETIS - UMR 8051), Ecole Nationale Supérieure de l'Electronique et de ses Applications (ENSEA)-Centre National de la Recherche Scientifique (CNRS)-CY Cergy Paris Université (CY), ANR-10-LABX-0094, PATRIMA, Tangible heritage (2010), Centre National de la Recherche Scientifique (CNRS)-Université de Versailles Saint-Quentin-en-Yvelines (UVSQ), Vodislav, Dan, Laboratoires d'excellence - Tangible heritage - PATRIMA2010 - ANR-10-LABX-0094 - LABX - VALID
Language:	English
Publication year:	2017
Subject:	[INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO.INFO-WB] Computer Science [cs]/Web
Source:	The Semantic Web, ESWC 2017, May 2017, Portoroz, Slovenia
Description:	The large number of linked datasets on the Web, and their diversity in terms of schema representation, have led to a fragmented dataset landscape. Querying and addressing information needs that span disparate datasets requires the alignment of such schemas. The majority of schema and ontology alignment approaches focus exclusively on class alignment. Relation alignment, in contrast, has not been fully addressed, and existing approaches fall short in addressing the dynamics of datasets and their size. In this work, we address the problem of relation alignment across disparate linked datasets. Our approach focuses on two main aspects. First, online relation alignment, where we do not require full access to the data and instead sample a minimal subset of it. Thus, we address the main limitation of existing work in dealing with the large scale of linked datasets, and with cases where the datasets provide only query access. Second, we learn supervised machine learning models, for which we employ various features, or matchers, that account for the diversity of linked datasets at the instance level. We perform an experimental evaluation on real-world linked datasets: DBpedia, YAGO, and Freebase. The results show superior performance against state-of-the-art approaches in schema matching, with an average relation alignment accuracy of 84%. In addition, we show that relation alignment can be performed efficiently at scale.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::12e881ebb5a1d0ee6be510c24a50afc1 https://hal.archives-ouvertes.fr/hal-01724199/document