Mapping layperson medical terminology into the Human Phenotype Ontology using neural machine translation models

Author:	Enrico Manzini, Jon Garrido-Aguirre, Jordi Fonollosa, Alexandre Perera-Lluna
Contributors:	Universitat Politècnica de Catalunya. Doctorat en Bioinformàtica; Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial; Universitat Politècnica de Catalunya. B2SLab - Bioinformatics and Biomedical Signals Laboratory
Language:	English
Publication year:	2022
Subject:	General Engineering; Word embedding; Deep learning; Human Phenotype Ontology; Deep phenotyping; Computer Science Applications; Artificial Intelligence; Natural language processing (Computer science); Medical informatics; Machine learning; Traducció automàtica; Aprenentatge automàtic; Informàtica::Intel·ligència artificial [Àrees temàtiques de la UPC]; Machine translation; Tractament del llenguatge natural (Informàtica); Machine translating
Source:	UPCommons. Portal del coneixement obert de la UPC Universitat Politècnica de Catalunya (UPC)
DOI:	10.1016/j.eswa.2022.117446
Description:	Supplementary material related to this article can be found online at https://doi.org/10.1016/j.eswa.2022.117446. In the medical domain there is a terminological gap between patients and caregivers on one side and healthcare professionals on the other. This gap may hinder communication between healthcare consumers and professionals in the field, with negative emotional and clinical consequences. In this work, we build a machine learning-based tool for automatic translation between the terminology used by laypeople and that of the Human Phenotype Ontology (HPO). HPO is a structured vocabulary of phenotypic abnormalities found in human disease. Our method uses an HPO-specific embedding as the output vector space of a neural network model trained on vector representations of layperson versions and other textual descriptors of medical terms. We explored different output embeddings coupled with different neural network architectures for the machine translation stage. We compute a similarity measure to evaluate the ability of the model to assign an HPO term to a layperson input. The best-performing models achieved a similarity higher than 0.7 for more than 80% of the terms, with a median between 0.98 and 1. The translator model is available as a web application at https://hpotranslator.b2slab.upc.edu. This work was supported by the Spanish Ministry of Economy and Competitiveness (www.mineco.gob.es) TEC2014-60337-R, DPI2017-89827-R, Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), initiatives of Instituto de Investigación Carlos III (ISCIII), and Share4Rare project (Grant Agreement 780262). This work was partially funded by ACCIÓ (Innotec ACE014/20/000018). B2SLab is certified as 2017 SGR 952. The authors thank the NVIDIA Corporation for the donation of a Titan Xp GPU used to run the models presented in this article. J. Fonollosa acknowledges the support from the Serra Húnter program.
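
The final step described in the abstract, assigning an ontology term to a vector predicted by the model, could be sketched roughly as a nearest-neighbour lookup in the output embedding space. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption here, and the term labels, the three-dimensional vectors, and the function names `cosine_similarity` and `nearest_term` are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def nearest_term(predicted, term_embeddings):
    """Return (term_label, similarity) for the embedding closest to `predicted`."""
    best_label, best_score = None, -1.0
    for label, vector in term_embeddings.items():
        score = cosine_similarity(predicted, vector)
        if best_label is None or score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Made-up toy embeddings; real ontology embeddings would have many more
# dimensions and would be keyed by actual term identifiers.
TOY_TERM_EMBEDDINGS = {
    "TERM_A": [1.0, 0.0, 0.0],
    "TERM_B": [0.0, 1.0, 0.0],
    "TERM_C": [0.0, 0.0, 1.0],
}

# Stand-in for the vector a trained network would output for a layperson phrase.
predicted_vector = [0.9, 0.1, 0.0]
label, score = nearest_term(predicted_vector, TOY_TERM_EMBEDDINGS)
print(label, round(score, 3))
```

In this toy setting, a score close to 1 means the predicted vector points almost exactly at one term's embedding, which is the kind of quantity the reported similarity thresholds refer to.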
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3cf9a0cc1fe0060de4b7e0ff3f40d4af https://dl.acm.org/doi/abs/10.1016/j.eswa.2022.117446