Machine Learning to Support the Presentation of Complex Pathway Graphs

Authors: Fintan McGee, Sune Steinbjorn Nielsen, Simone Zorzan, David Hoksza, Marek Ostaszewski
Publication year: 2021
Subject:
Source: IEEE/ACM Transactions on Computational Biology and Bioinformatics. 18:1130-1141
ISSN: 2374-0043; 1545-5963
Description: Visualization of biological mechanisms by means of pathway graphs is necessary to better understand the often complex underlying system. Manual layout of such pathways, or knowledge maps, is a difficult and time-consuming process. Node duplication is a technique that improves layout readability by reducing edge crossings and shortening edge lengths in drawn diagrams. In this article, we propose an approach using Machine Learning (ML) to facilitate parts of this task by training a Support Vector Machine (SVM) on actions taken during manual biocuration. Our training input is a series of incremental snapshots of a diagram describing mechanisms of a disease, progressively curated by a human expert who employed node duplication in the process. To test the trained SVM models, we apply them to a single large instance and 25 medium-sized instances of hand-curated biological pathways. Finally, in a user validation study, we compare the model predictions to the outcome of a node duplication questionnaire answered by users of biological pathways with varying experience. We successfully predicted nodes for duplication and emulated human choices, demonstrating that our approach can effectively learn human-like node duplication preferences to support curation of pathway diagrams in various contexts.
Database: OpenAIRE
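
The node duplication operation mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `duplicate_node`, the edge-list representation, and the `name#k` copy labels are all assumptions made for illustration, and the choice of which node to duplicate (the part the article assigns to the trained SVM) is simply passed in by hand here.

```python
def duplicate_node(edges, node):
    """Give every edge incident to `node` its own private copy of it.

    `edges` is a list of (u, v) pairs describing an undirected graph
    without self-loops (an assumption of this sketch). Copies are
    labelled "<node>#1", "<node>#2", ... in edge order; the labelling
    scheme is hypothetical.
    """
    new_edges = []
    copies = 0
    for u, v in edges:
        if u == node:
            copies += 1
            u = f"{node}#{copies}"
        if v == node:
            copies += 1
            v = f"{node}#{copies}"
        new_edges.append((u, v))
    return new_edges


# A small hub-like example: "ATP" takes part in three reactions.
edges = [("ATP", "r1"), ("ATP", "r2"), ("r3", "ATP"), ("r1", "r2")]
split = duplicate_node(edges, "ATP")
print(split)
```

After the call, each reaction is attached to its own copy of the duplicated node, so a layout algorithm can place each copy next to its single neighbour instead of routing long edges to one shared hub.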