Predicting missing and noisy links via neighbourhood preserving graph embeddings in a clinical knowledgebase

Authors: Shameek Ghosh, Budhaditya Saha, Vedant Mandloi
Year of publication: 2020
Subject:
Source: ICMLA
DOI: 10.1109/icmla51294.2020.00181
Description: Clinical knowledge graphs (KGs) are often incomplete in their early stages of development, prone to human error, and subject to limited quality control. Because these large-scale clinical graphs are continuously updated by humans, data errors are frequent and costly. These errors originate from maintainers' biases, experiences, and assumptions, all of which introduce noise. Hence, enterprise-grade clinical knowledge graphs deployed in practical settings must be evaluated to remove errors, and predictive techniques should assist human maintainers in enriching the KG. In this study, we propose a novel application of a neighborhood-based relationship entity embedding method to simultaneously predict both noisy and missing links between the disease and symptom entities in a clinical knowledge graph. The relationship (link) embedding method is derived from a knowledge graph embedding framework. Our evaluations and experiments demonstrate that the proposed methodology achieves substantial improvements over state-of-the-art baseline methods in clinical KG link prediction. In practical settings, the clinical missing-link and noise-correction method has reduced the manual data review process for clinicians and has also predicted potential relationships between a disease and a symptom in the clinical KG.
Database: OpenAIRE