OntoMerger: An Ontology Integration Library for Deduplicating and Connecting Knowledge Graph Nodes

Authors: Geleta, David; Nikolov, Andriy; ODonoghue, Mark; Rozemberczki, Benedek; Gogleva, Anna; Tamma, Valentina; Payne, Terry R.
Year of publication: 2022
Subject:
Document type: Working Paper
Description: Node duplication is a common problem when building knowledge graphs (KGs) from heterogeneous datasets, where it is crucial to be able to merge nodes that have the same meaning. OntoMerger is a Python ontology integration library that deduplicates KG nodes. Our approach takes a set of KG nodes, mappings, and disconnected hierarchies and generates a set of merged nodes together with a connected hierarchy. In addition, the library provides analytic and data-testing functionality that can be used to fine-tune the inputs, further reducing duplication and increasing the connectivity of the output graph. OntoMerger can be applied to a wide variety of ontologies and KGs. In this paper we introduce OntoMerger and illustrate its functionality on a real-world biomedical KG.
Comment: Code available under: https://github.com/AstraZeneca/onto_merger
Database: arXiv
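
The merge step summarized in the abstract (nodes plus mappings plus hierarchy edges in, merged nodes plus rewritten hierarchy out) can be illustrated with a minimal sketch. This is not OntoMerger's API or its actual algorithm: the function `merge_nodes`, its parameters, the source-priority rule, and the node identifiers are all assumptions made for illustration, and the sketch uses a plain union-find over equivalence mappings. It only rewrites existing hierarchy edges onto canonical nodes and does not attempt to connect separate hierarchy fragments.

```python
from typing import Dict, Iterable, List, Set, Tuple


def merge_nodes(
    nodes: Iterable[str],
    mappings: Iterable[Tuple[str, str]],
    hierarchy: Iterable[Tuple[str, str]],
    source_priority: List[str],
) -> Tuple[Dict[str, str], Set[Tuple[str, str]]]:
    """Hypothetical helper: return (node -> canonical node, child -> parent edges).

    Nodes linked by an equivalence mapping collapse onto one canonical node,
    chosen by the order of ID prefixes in `source_priority`.
    """
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        # Union-find lookup with path halving; unknown nodes are added lazily.
        parent.setdefault(n, n)
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    def rank(n: str) -> int:
        # Lower rank means a more preferred source; unknown prefixes rank last.
        prefix = n.split(":", 1)[0]
        if prefix in source_priority:
            return source_priority.index(prefix)
        return len(source_priority)

    for a, b in mappings:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the preferred root so each group's root is its canonical node.
            keep, drop = sorted((root_a, root_b), key=lambda n: (rank(n), n))
            parent[drop] = keep

    # Rewrite edges onto canonical nodes; the set removes duplicates,
    # and edges that collapse into self-loops are dropped.
    edges = {(find(c), find(p)) for c, p in hierarchy if find(c) != find(p)}
    canonical = {n: find(n) for n in list(parent)}
    return canonical, edges


# Made-up identifiers, for illustration only.
canonical, edges = merge_nodes(
    nodes=["MONDO:1", "DOID:1", "MESH:1", "MONDO:0", "DOID:0"],
    mappings=[("DOID:1", "MONDO:1"), ("MESH:1", "DOID:1"), ("DOID:0", "MONDO:0")],
    hierarchy=[("MONDO:1", "MONDO:0"), ("DOID:1", "DOID:0"), ("MESH:1", "DOID:0")],
    source_priority=["MONDO", "DOID", "MESH"],
)
print(canonical)
print(edges)
```

In this toy input, three duplicate child nodes and two duplicate parent nodes collapse into one child and one parent, so the three original hierarchy edges reduce to a single edge between the canonical nodes.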