Learning probabilistic relational models with (partially structured) graph databases

Authors: Marwa El Abri, Philippe Leray, Nadia Essoussi
Contributors: Laboratoire des Sciences du Numérique de Nantes (LS2N), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN)-Université de Nantes (UN)-École Centrale de Nantes (ECN)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Recherche Opérationnelle de Décision et de Contrôle de Processus (LARODEC), Université de Tunis-ISG de Tunis, Data User Knowledge (DUKe), Université de Nantes (UN)-Université de Nantes (UN)-École Centrale de Nantes (ECN)-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), PILGRIM
Language: English
Year of publication: 2017
Subject:
Source: 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA)
14th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 2017), 2017, Hammamet, Tunisia. ⟨10.1109/AICCSA.2017.39⟩
DOI: 10.1109/AICCSA.2017.39
Description: Probabilistic Relational Models (PRMs), such as Directed Acyclic Probabilistic Entity Relationship (DAPER) models, are probabilistic models for knowledge representation over relational data. The existing literature on PRMs and DAPER relies on well-structured relational databases. In contrast, a large portion of real-world data is stored in NoSQL databases, especially graph databases, which do not depend on a rigid schema. This paper builds on recent work on DAPER models and describes how to learn them from partially structured graph databases. Our contribution is twofold. First, we present how to extract the underlying ER model from a partially structured graph database. Then, we describe a method to compute sufficient statistics based on graph traversal techniques. Our objective is also twofold: we want to learn DAPERs from less structured data, and we want to accelerate the learning process by querying graph databases. Our experiments show that both objectives are met, making structure learning more feasible even when the data are less structured than in a usual relational database.
Database: OpenAIRE
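
Illustrative note: the two steps named in the description (extracting an ER model from a partially structured graph, then computing sufficient statistics by traversal) can be pictured with a small sketch. This is not the authors' algorithm and it does not query a real graph database. It assumes a toy in-memory property graph, and every name in it (NODES, EDGES, extract_schema, sufficient_statistics, the User/Movie/RATED data) is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy property graph. Nodes map an id to (label, properties);
# properties may differ between nodes of the same label (partial structure).
NODES = {
    1: ("User", {"age": "young"}),
    2: ("User", {"age": "old", "city": "Nantes"}),
    3: ("Movie", {"genre": "drama"}),
    4: ("Movie", {"genre": "comedy"}),
}
# Edges are (source id, relationship type, target id, properties).
EDGES = [
    (1, "RATED", 3, {"score": "high"}),
    (1, "RATED", 4, {"score": "low"}),
    (2, "RATED", 3, {"score": "high"}),
]


def extract_schema(nodes, edges):
    """Collect entity classes, relationship classes and the attributes seen on each."""
    entities = defaultdict(set)
    for label, props in nodes.values():
        entities[label].update(props)  # union of attribute names per label
    relationships = defaultdict(set)
    for src, rel, dst, props in edges:
        key = (nodes[src][0], rel, nodes[dst][0])  # (source label, type, target label)
        relationships[key].update(props)
    return dict(entities), dict(relationships)


def sufficient_statistics(nodes, edges, rel, child_attr, parent_attr):
    """Count joint (parent value, child value) configurations along edges of type rel.

    Assumption for this sketch: the child attribute lives on the edge and the
    parent attribute lives on the edge's target node.
    """
    counts = Counter()
    for src, r, dst, props in edges:
        if r != rel:
            continue
        parent = nodes[dst][1].get(parent_attr)
        child = props.get(child_attr)
        if parent is None or child is None:
            continue  # skip instances where an attribute is missing
        counts[(parent, child)] += 1
    return counts


entities, relationships = extract_schema(NODES, EDGES)
stats = sufficient_statistics(NODES, EDGES, "RATED", "score", "genre")
```

In this sketch, entities and relationships play the role of the recovered ER model, and stats holds the counts from which a conditional distribution of score given genre could be estimated. A real implementation would issue traversal queries against the graph database instead of looping over Python lists.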