Mapping layperson medical terminology into the Human Phenotype Ontology using neural machine translation models

Authors: Enrico Manzini, Jon Garrido-Aguirre, Jordi Fonollosa, Alexandre Perera-Lluna
Contributors: Universitat Politècnica de Catalunya. Doctorat en Bioinformàtica, Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya. B2SLab - Bioinformatics and Biomedical Signals Laboratory
Language: English
Publication year: 2022
Subject:
Source: UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
DOI: 10.1016/j.eswa.2022.117446
Description: Supplementary material related to this article can be found online at https://doi.org/10.1016/j.eswa.2022.117446. In the medical domain, there is a terminological gap between patients and caregivers on one side and healthcare professionals on the other. This gap may hinder communication between healthcare consumers and professionals in the field, with negative emotional and clinical consequences. In this work, we build a machine learning-based tool for automatic translation between the terminology used by laypeople and that of the Human Phenotype Ontology (HPO). HPO is a structured vocabulary of phenotypic abnormalities found in human disease. Our method uses an HPO-specific embedding, represented in a vector space, as the output space for a neural network model trained on vector representations of layperson versions and other textual descriptors of medical terms. We explored different output embeddings coupled with different neural network architectures for the machine translation stage. We compute a similarity measure to evaluate the ability of the model to assign an HPO term to a layperson input. The best-performing models achieved a similarity higher than 0.7 for more than 80% of the terms, with a median between 0.98 and 1. The translator model is available as a web application at https://hpotranslator.b2slab.upc.edu. This work was supported by the Spanish Ministry of Economy and Competitiveness (www.mineco.gob.es) TEC2014-60337-R, DPI2017-89827-R, Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), initiatives of Instituto de Investigación Carlos III (ISCIII), and Share4Rare project (Grant Agreement 780262). This work was partially funded by ACCIÓ (Innotec ACE014/20/000018). B2SLab is certified as 2017 SGR 952. The authors thank the NVIDIA Corporation for the donation of a Titan Xp GPU used to run the models presented in this article. J. Fonollosa acknowledges the support from the Serra Húnter program.
Database: OpenAIRE
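
The description above outlines a model that outputs a vector in an HPO embedding space, followed by a similarity measure used to assign an HPO term and to report the share of terms scoring above 0.7. The record does not say which similarity measure or embeddings were used, so the following is only a minimal sketch under stated assumptions: cosine similarity is assumed, the three-dimensional vectors are toy values, and the names `cosine_similarity`, `nearest_term`, and `fraction_above` are hypothetical helpers, not part of the authors' tool.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as no match
    return dot / (norm_u * norm_v)


def nearest_term(predicted, term_embeddings):
    """Return the (term, similarity) pair whose embedding is closest to `predicted`."""
    return max(
        ((term, cosine_similarity(predicted, vec)) for term, vec in term_embeddings.items()),
        key=lambda pair: pair[1],
    )


def fraction_above(similarities, threshold=0.7):
    """Share of similarity scores strictly above `threshold`."""
    if not similarities:
        return 0.0
    return sum(1 for s in similarities if s > threshold) / len(similarities)


# Toy embedding table: illustrative vectors only, not real HPO embeddings.
toy_embeddings = {
    "Seizure": [1.0, 0.0, 0.0],
    "Headache": [0.0, 1.0, 0.0],
    "Fever": [0.0, 0.0, 1.0],
}

# Stand-in for the vector a trained model might output for a layperson phrase.
predicted_vector = [0.9, 0.1, 0.0]

term, similarity = nearest_term(predicted_vector, toy_embeddings)
print(term, round(similarity, 3))
```

In this sketch, the term assignment is a nearest-neighbour lookup in the output space, and `fraction_above` mirrors the kind of summary reported in the description (the proportion of terms whose similarity exceeds a threshold).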