EnzDP: Improved enzyme annotation for metabolic network reconstruction based on domain composition profiles
Author: | Sriganesh Srihari, Hon Wai Leong, Nam Ninh Nguyen, Ket Fah Chong |
---|---|
Year of publication: | 2015 |
Subject: | Architecture domain; Protein domain; Metabolic network; Sequence alignment; Biology; computer.software_genre; Biochemistry; Machine Learning; Annotation; Catalytic Domain; Cluster Analysis; Databases, Protein; Hidden Markov model; Cluster analysis; Molecular Biology; Markov chain; Computational Biology; Markov Chains; Enzymes; Protein Structure, Tertiary; Computer Science Applications; Structural Homology, Protein; Data mining; Sequence Alignment; computer; Metabolic Networks and Pathways |
Source: | Journal of Bioinformatics and Computational Biology. 13:1543003 |
ISSN: | 1757-6334 0219-7200 |
DOI: | 10.1142/s0219720015430039 |
Description: | Determining the entire complement of enzymes and their enzymatic functions is a fundamental step in reconstructing the metabolic network of cells. High-quality enzyme annotation enhances metabolic networks reconstructed from the genome, especially by reducing gaps and increasing enzyme coverage. Currently, structure-based and network-based approaches can cover only a limited number of enzyme families, and the accuracy of homology-based approaches can be further improved. The bottom-up homology-based approach improves coverage by rebuilding Hidden Markov Model (HMM) profiles for all known enzymes. However, its clustering procedure relies heavily on BLAST similarity scores, ignores protein domains/patterns, and is sensitive to changes in cut-off thresholds. Here, we use functional domain architecture to score the association between domain families and enzyme families (Domain-Enzyme Association Scoring, DEAS). The DEAS score is used to calculate the similarity between proteins, which then replaces the sequence similarity score in the clustering procedure. We improve the enzyme annotation protocol by using a stringent classification procedure, choosing optimal threshold settings, and checking for active sites. Our analysis shows that our stringent protocol, EnzDP, can cover up to 90% of the enzyme families available in Swiss-Prot. It achieves a high accuracy of 94.5% based on five-fold cross-validation. EnzDP outperforms existing methods across several testing scenarios. Thus, EnzDP serves as a reliable automated tool for enzyme annotation and metabolic network reconstruction. Available at: www.comp.nus.edu.sg/~nguyennn/EnzDP . |
Database: | OpenAIRE |
External link: |
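
The abstract describes scoring the association between domain families and enzyme families and then deriving protein similarity from those scores, but it does not give the formulas. The sketch below is only an illustration of that general idea, not the EnzDP method: it assumes the association score is the fraction of annotated proteins containing a domain that carry a given EC number, and it assumes cosine similarity between the resulting per-protein profiles. The function names, the domain labels (`domA`, `domB`, `domC`) and the toy training set are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def association_scores(training):
    """training: iterable of (domains, ec) pairs for annotated proteins.

    Returns scores[domain][ec] = fraction of proteins containing `domain`
    that are annotated with `ec` (an assumed scoring rule, not the paper's).
    """
    domain_count = Counter()
    pair_count = defaultdict(Counter)
    for domains, ec in training:
        for d in set(domains):  # count each domain once per protein
            domain_count[d] += 1
            pair_count[d][ec] += 1
    return {
        d: {ec: n / domain_count[d] for ec, n in ecs.items()}
        for d, ecs in pair_count.items()
    }


def protein_profile(domains, scores):
    """Sum the per-domain association scores into one EC-indexed vector."""
    profile = Counter()
    for d in set(domains):
        for ec, s in scores.get(d, {}).items():
            profile[ec] += s
    return profile


def similarity(p, q):
    """Cosine similarity between two EC-indexed profiles (assumed measure)."""
    dot = sum(v * q[ec] for ec, v in p.items() if ec in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Toy, made-up training data: (domain composition, EC number).
training = [
    (["domA", "domB"], "1.1.1.1"),
    (["domA"], "1.1.1.1"),
    (["domC"], "2.7.1.1"),
]
scores = association_scores(training)
prof_ab = protein_profile(["domA", "domB"], scores)
prof_a = protein_profile(["domA"], scores)
prof_c = protein_profile(["domC"], scores)
sim_same_family = similarity(prof_ab, prof_a)
sim_other_family = similarity(prof_ab, prof_c)
```

A similarity matrix built this way could be passed to any standard clustering routine in place of a sequence-similarity matrix. The thresholds, the HMM profile construction and the active-site checks mentioned in the abstract are outside the scope of this sketch.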