Automated protein motif generation in the structure-based protein function prediction tool ProMOL
Authors: | Cameron Baker, Paul Craig, Jeffrey Mills, Herbert J. Bernstein, Mitchell Lambrecht, Shariq Madha, Mikhail Osipovitch |
---|---|
Publication year: | 2015 |
Subject: | Models, Molecular; Protein Conformation; Computer science; Amino Acid Motifs; Molecular Sequence Data; Protein Data Bank (RCSB PDB); Biochemistry; Article; Molecular graphics; Structure-Activity Relationship; Structural bioinformatics; Structural Biology; Catalytic Domain; Genetics; Protein function prediction; Amino Acid Sequence; Databases, Protein; Structural motif; Computational Biology; Proteins; Pattern recognition; General Medicine; Automation; ComputingMethodologies_PATTERNRECOGNITION; Template; Data mining; Artificial intelligence; Motif (music); Sequence Alignment; Algorithms; Software |
Source: | Journal of Structural and Functional Genomics. 16:101-111 |
ISSN: | 1570-0267 1345-711X |
Description: | ProMOL, a plugin for the PyMOL molecular graphics system, is a structure-based protein function prediction tool. ProMOL includes a set of routines for building motif templates that are used for screening query structures for enzyme active sites. Previously, each motif template was generated manually and required supervision in the optimization of parameters for sensitivity and selectivity. We developed an algorithm and workflow for the automation of motif building and testing routines in ProMOL. The algorithm uses a set of empirically derived parameters for optimization and requires little user intervention. The automated motif generation algorithm was first tested in a performance comparison with a set of manually generated motifs based on identical active sites from the same 112 PDB entries. The two sets of motifs were equally effective in identifying alignments with homologs and in rejecting alignments with unrelated structures. A second set of 296 active site motifs was generated automatically, based on Catalytic Site Atlas entries with literature citations, as an expansion of the library of existing manually generated motif templates. The new motif templates exhibited comparable performance to the existing ones in terms of hit rates against native structures, homologs with the same EC and Pfam designations, and randomly selected unrelated structures with a different EC designation at the first EC digit, as well as in terms of RMSD values obtained from local structural alignments of motifs and query structures. This research is supported by NIH grant GM078077. |
Database: | OpenAIRE |