DigChem: Identification of disease-gene-chemical relationships from Medline abstracts
Author: | Hyunju Lee, Jung-jae Kim, Jeongkyun Kim |
---|---|
Year of publication: | 2018 |
Subject: |
0301 basic medicine
Male Databases Factual Disease Alzheimer's Disease Manual curation Machine Learning Database and Informatics Methods 0302 clinical medicine Databases Genetic Medicine and Health Sciences Data Mining Biology (General) Database Searching Disease gene Ecology Prostate Cancer Prostate Diseases Neurodegenerative Diseases Computational Theory and Mathematics Neurology Oncology Modeling and Simulation Information Retrieval Identification (biology) Female Information Technology Sequence Analysis Chemically-Induced Disorders Research Article PubMed Computer and Information Sciences Neural Networks QH301-705.5 Abstracting and Indexing Bioinformatics MEDLINE Urology Computational biology Biology Research and Analysis Methods 03 medical and health sciences Cellular and Molecular Neuroscience Deep Learning Artificial Intelligence Word Embedding Mental Health and Psychiatry Genetics Humans Molecular Biology Ecology Evolution Behavior and Systematics Natural Language Processing Computational Biology Biology and Life Sciences Cancers and Neoplasms Search Engine Genitourinary Tract Tumors 030104 developmental biology Dementia Neural Networks Computer Sequence Alignment 030217 neurology & neurosurgery Neuroscience |
Source: | PLoS Computational Biology, Vol 15, Iss 5, p e1007022 (2019) |
ISSN: | 1553-7358 |
Description: | Chemicals interact with genes in the process of disease development and treatment. Although much biomedical research has been performed to understand relationships among genes, chemicals, and diseases, which have been reported in biomedical articles in Medline, there are few studies that extract disease–gene–chemical relationships from biomedical literature at a PubMed scale. In this study, we propose a deep learning model based on bidirectional long short-term memory to identify the evidence sentences of relationships among genes, chemicals, and diseases from Medline abstracts. Then, we develop the search engine DigChem to enable disease–gene–chemical relationship searches for 35,124 genes, 56,382 chemicals, and 5,675 diseases. We show that the identified relationships are reliable by comparing them with manual curation and existing databases. DigChem is available at http://gcancer.org/digchem. Author summary: For understanding the role of chemicals in the molecular process of disease development and treatment, it is important to extract disease–gene–chemical relationships from the literature, that is, which gene and which chemical interact with each other for the development and treatment of which disease. Previous works extract binary relationships such as gene–chemical and disease–gene relationships from the literature and employ statistical measurements for deducing the triple relationship of disease–gene–chemical. Since the statistical measurements for inference often fail, we develop a deep learning model to identify the evidence sentences of disease–gene–chemical relationships from Medline abstracts. We show that the identified relationships are reliable by comparing them with manually curated databases. We also provide a search engine called DigChem over the identified evidence sentences. |
Database: | OpenAIRE |
External link: |
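
The abstract above speaks of identifying evidence sentences that relate a disease, a gene, and a chemical. As a rough illustration only, the sketch below shows one plausible way to collect candidate sentences that a sentence classifier (such as the bidirectional LSTM the abstract mentions) could then score. It is not the authors' pipeline: the lexicon matching, the naive sentence splitter, the function names, and the placeholder entity names (`diseaseA`, `GENE1`, `GENE2`, `chemX`) are all assumptions made for this example.

```python
import re


def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", abstract)
    return [p.strip() for p in parts if p.strip()]


def find_mentions(sentence, lexicon):
    """Return the lexicon terms found as whole words (case-insensitive), sorted."""
    found = []
    for term in lexicon:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(term)
    return sorted(found)


def candidate_triples(abstract, diseases, genes, chemicals):
    """Collect (sentence, disease, gene, chemical) for every sentence that
    mentions at least one entity of each type. These are only candidates;
    a trained classifier would decide which ones are real evidence."""
    results = []
    for sentence in split_sentences(abstract):
        found_diseases = find_mentions(sentence, diseases)
        found_genes = find_mentions(sentence, genes)
        found_chemicals = find_mentions(sentence, chemicals)
        for disease in found_diseases:
            for gene in found_genes:
                for chemical in found_chemicals:
                    results.append((sentence, disease, gene, chemical))
    return results


# Placeholder lexicons and text (hypothetical, not real biomedical claims).
DISEASES = {"diseaseA"}
GENES = {"GENE1", "GENE2"}
CHEMICALS = {"chemX"}
TEXT = ("GENE1 is widely studied. "
        "chemX lowers GENE2 activity in diseaseA. "
        "diseaseA is common.")

for hit in candidate_triples(TEXT, DISEASES, GENES, CHEMICALS):
    print(hit)
```

Only the middle sentence names all three entity types, so it is the only candidate produced. Co-mention alone says nothing about whether the sentence actually asserts a relationship, which is the gap a learned sentence classifier is meant to close.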