CEGSO: Boosting Essential Proteins Prediction by Integrating Protein Complex, Gene Expression, Gene Ontology, Subcellular Localization and Orthology Information
Authors: | Guanghui Li, Xiaoli Xue, Junhong Liu, Li Yuanyuan, Chengwang Xie, Wei Zhang, Hailin Chen |
---|---|
Publication year: | 2020 |
Subject: |
Boosting (machine learning); Computer science; Intracellular Space; Gene Expression; Health Informatics; Computational biology; Network topology; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; Protein Interaction Maps; 030304 developmental biology; Biological Phenomena; 0303 health sciences; Gene ontology; 030302 biochemistry & molecular biology; Proteins; Subcellular localization; Computer Science Applications; PPI network; Benchmark (computing); Transcriptome; Algorithms; Data integration; Protein Binding |
Source: | Interdisciplinary sciences, computational life sciences. 13(3) |
ISSN: | 1867-1462 |
Description: | Essential proteins are considered indispensable for sustaining normal physiological function and are crucial to drug design and disease diagnosis. Discovering essential proteins is of great importance in revealing molecular mechanisms and biological processes. Because biological experiments are tedious, many computational methods have been developed to discover key proteins by mining features of high-throughput data. Appropriate integration of different kinds of biological information with the protein–protein interaction (PPI) network has proven useful in predicting essential proteins. The main aim of this research is to provide a comprehensive study and review of identifying essential proteins by integrating multi-source data, and to offer guidance for researchers. Current essential protein prediction algorithms are analyzed and compared in detail and tested on benchmark PPI networks. In addition, building on the previous method TEGS (short for network Topology, gene Expression, Gene ontology, and Subcellular localization), we improve the prediction of essential proteins by incorporating known protein complex information, gene expression profiles, Gene Ontology (GO) term information, subcellular localization information, and protein orthology data into the PPI network; the resulting method is named CEGSO. The simulation results show that CEGSO achieves more accurate and robust results than the other compared methods across different test datasets and various evaluation measures. |
Database: | OpenAIRE |
External link: |