Siamese KG-LSTM: A deep learning model for enriching UMLS Metathesaurus synonymy
Author: | Tho Quan, Van T. Le, Vinh Nguyen, Sy V. Nghiem, Hong Yung Yip, Olivier Bodenreider, Tien T. T. Tran |
---|---|
Year of publication: | 2020 |
Subject: |
0303 health sciences; Medical terminology; Computer science; Synonym; UMLS Metathesaurus; Process (engineering); business.industry; Deep learning; Unified Medical Language System; 02 engineering and technology; computer.software_genre; Article; 03 medical and health sciences; Knowledge graph; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Knowledge sources; Artificial intelligence; business; computer; Natural language processing; 030304 developmental biology |
Source: | KSE Int Conf Knowl Syst Eng |
DOI: | 10.1109/kse50997.2020.9287797 |
Description: | The Unified Medical Language System (UMLS) is a repository of medical terminology developed by the U.S. National Library of Medicine to improve the ability of computer systems to understand biomedical and health language. The UMLS Metathesaurus is one of the three UMLS knowledge sources and contains medical terms and their relationships. Because the number of medical terms has grown rapidly in recent years, the current construction of the UMLS Metathesaurus, which depends heavily on lexical tools and human editors, is error-prone and time-consuming. This paper takes advantage of emerging deep learning models to predict whether pairs of biomedical terms in the Metathesaurus are synonyms or non-synonyms. Our learning approach focuses on a subset of specific terms instead of the whole Metathesaurus corpus. In particular, we train the models with biomedical terms from the Disorders semantic group. To strengthen the models, we enrich the inputs using different strategies, including synonyms and hierarchical relationships from source vocabularies. Our deep learning model adopts a Siamese KG-LSTM (Siamese Knowledge Graph - Long Short-Term Memory) architecture. The experimental results show that this approach yields excellent performance on the task of synonym detection for the Disorders semantic group in the Metathesaurus. This shows the potential of applying machine learning techniques in the UMLS Metathesaurus construction process. Although the work in this paper focuses only on the Disorders semantic group, we believe that the proposed method can be applied to other semantic groups in the UMLS Metathesaurus. |
Database: | OpenAIRE |
External link: |