Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles
Authors: | Sanghyun Park, Jeongwoo Kim, Chihyun Park, Jung Rim Kim |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: |
0301 basic medicine; Decision Analysis; Computer science; Gene regulatory network; Gene Identification and Analysis; lcsh:Medicine; Gene Expression; Genetic Networks; computer.software_genre; Alzheimer's Disease; Interactome; Machine Learning; 0302 clinical medicine; Databases Genetic; Feature (machine learning); Medicine and Health Sciences; Gene Regulatory Networks; lcsh:Science; Multidisciplinary; Applied Mathematics; Simulation and Modeling; Neurodegenerative Diseases; Random forest; Identification (information); Neurology; Physical Sciences; Engineering and Technology; Management Engineering; Algorithms; Network Analysis; Research Article; Computer and Information Sciences; Machine learning; Research and Analysis Methods; 03 medical and health sciences; Machine Learning Algorithms; Artificial Intelligence; Mental Health and Psychiatry; Genetics; Gene; business.industry; Gene Expression Profiling; lcsh:R; Decision Trees; Biology and Life Sciences; Computational Biology; Epistasis Genetic; Decision Tree Learning; Gene expression profiling; 030104 developmental biology; Epistasis; lcsh:Q; Dementia; Artificial intelligence; business; computer; 030217 neurology & neurosurgery; Mathematics |
Source: | PLoS ONE, Vol 13, Iss 7, p e0201056 (2018) |
ISSN: | 1932-6203 |
Description: | The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs of genes must be obtained. However, when gene expression data are heterogeneous with high levels of noise for samples assigned to the same condition, it is difficult to accurately determine whether a gene pair represents a significant gene-gene interaction (GGI). In order to solve this problem, we proposed a random forest-based method to classify significant GGIs from gene expression data. To train the model, we defined novel feature sets and utilised various high-confidence interactome datasets to deduce the correct answer set from known disease-specific genes. Using Alzheimer's disease data, the proposed method showed remarkable accuracy, and the GGIs established in the analysis can be used to build a meaningful genetic network that can explain the mechanisms underlying Alzheimer's disease. |
Database: | OpenAIRE |
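
The description notes that constructing a gene network starts from correlations among pairs of genes. As a rough illustration of that baseline step only (not the random forest-based method the authors propose, whose feature sets are not given in this record), the sketch below computes a Pearson correlation for every gene pair. The gene names, the expression values, and the helper names `pearson` and `pairwise_correlations` are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; this sketch
        # treats it as 0 by assumption.
        return 0.0
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(profiles):
    """Map each unordered gene pair to the correlation of its profiles."""
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Made-up expression values for three hypothetical genes across four samples.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
correlations = pairwise_correlations(toy_profiles)
```

With heterogeneous, noisy samples, a single coefficient per pair like this can be unreliable, which is the difficulty the description says the proposed classifier is meant to address.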