A novel essential protein identification method based on PPI networks and gene expression data
Author: | Qiang Tang, Yusui Sun, Jiancheng Zhong, Jiahong Yang, Wei Peng, Minzhu Xie, Qiu Xiao, Chao Tang |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Edge clustering coefficient; Saccharomyces cerevisiae Proteins; Jaccard index; Computer science; QH301-705.5; 0206 medical engineering; Computer applications to medicine. Medical informatics; R858-859.7; The PPI networks; 02 engineering and technology; Computational biology; Biochemistry; Jaccard similarity index; 03 medical and health sciences; Similarity (network science); Structural Biology; Protein Interaction Mapping; Gene expression; Protein Interaction Maps; Biology (General); Molecular Biology; 030304 developmental biology; 0303 health sciences; Applied Mathematics; Computer Science Applications; ROC Curve; Ppi network; Benchmark (computing); Protein identification; Essential proteins; DNA microarray; Transcriptome; Centrality; Algorithms; 020602 bioinformatics; Research Article |
Source: | BMC Bioinformatics, Vol 22, Iss 1, Pp 1-21 (2021) |
ISSN: | 1471-2105 |
Description: | Background: Some proposed methods for identifying essential proteins achieve better results by incorporating biological information, and gene expression data is commonly used for this purpose. However, gene expression data is prone to fluctuations, which may reduce the accuracy of essential protein identification. We therefore propose an essential protein identification method that uses gene expression and PPI network data to calculate the similarity of the "active" and "inactive" states of gene expression within a cluster of the PPI network. Our experiments show that the method can improve the accuracy of essential protein prediction. Results: We propose a new measure named JDC, which is based on PPI network data and gene expression data. The JDC method uses a dynamic threshold to binarize gene expression data. It then combines degree centrality and the Jaccard similarity index to calculate a JDC score for each protein in the PPI network. We benchmark JDC on four organisms and evaluate it using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that JDC performs better than DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We also compare JDC with the NF-PIN and TS-PIN methods, which predict essential proteins through active PPI networks constructed from dynamic gene expression. Conclusions: We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods given the same input. The main ideas behind JDC are as follows: (1) Essential proteins generally form densely connected clusters in the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of a protein depends on the similarity of the "active" and "inactive" states of gene expression within a cluster of the PPI network. |
Database: | OpenAIRE |
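The abstract names three ingredients: binarizing expression profiles with a threshold, comparing "active"/"inactive" states with the Jaccard index, and using degree in the PPI network. The Python sketch below illustrates how such ingredients could fit together; it is not the published JDC formula, which this record does not give. The threshold rule (per-gene mean plus `k` standard deviations), the function names, the scoring rule (sum of Jaccard similarities over a protein's neighbours), and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def binarize(profile, k=0.0):
    """Return 1 ("active") where expression exceeds mean + k * std, else 0.

    The per-gene threshold and the parameter k are assumptions made for
    illustration; the record does not state the paper's threshold rule.
    """
    threshold = mean(profile) + k * pstdev(profile)
    return [1 if value > threshold else 0 for value in profile]


def jaccard(a, b):
    """Jaccard index of the sets of time points where a and b are active."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def neighbour_similarity_scores(edges, expression, k=0.0):
    """Sum, over each protein's PPI neighbours, the Jaccard similarity of
    the binarized profiles. Hypothetical stand-in, not the JDC formula."""
    states = {p: binarize(profile, k) for p, profile in expression.items()}
    scores = {p: 0.0 for p in expression}
    for u, v in edges:
        similarity = jaccard(states[u], states[v])
        scores[u] += similarity
        scores[v] += similarity
    return scores


# Toy, made-up data: four proteins, four expression time points each.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
expression = {
    "A": [1.0, 5.0, 5.0, 1.0],
    "B": [1.0, 6.0, 5.5, 1.0],
    "C": [4.0, 1.0, 1.0, 4.0],
    "D": [1.0, 1.0, 6.0, 1.0],
}

scores = neighbour_similarity_scores(edges, expression)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Summing over neighbours means a protein scores highly only when it has many interaction partners and its activity pattern matches theirs, which is one simple way to let degree and expression similarity reinforce each other.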