High-throughput identification of interacting protein-protein binding sites

Authors: Philip E. Bourne, Jo-Lan Chung, Wei Wang
Language: English
Subject:
Molecular Sequence Data
Protein Data Bank (RCSB PDB)
Sequence (biology)
Computational biology
Plasma protein binding
Biology
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
Pattern Recognition, Automated
03 medical and health sciences
Artificial Intelligence
Sequence Analysis, Protein
Structural Biology
Protein Interaction Mapping
Computer Simulation
Amino Acid Sequence
Amino Acids
Binding site
lcsh:QH301-705.5
Throughput (business)
Molecular Biology
030304 developmental biology
0303 health sciences
Binding Sites
Applied Mathematics
030302 biochemistry & molecular biology
Molecular biology
Computer Science Applications
DNA binding site
Identification (information)
Models, Chemical
lcsh:Biology (General)
lcsh:R858-859.7
DNA microarray
Algorithms
Research Article
Protein Binding
Source: BMC Bioinformatics, Vol 8, Iss 1, p 223 (2007)
ISSN: 1471-2105
DOI: 10.1186/1471-2105-8-223
Description: Background: With the growth of sequence and structural data, a number of methods have been proposed to locate putative protein binding sites on protein surfaces. Methods that can identify whether these binding sites interact are therefore needed. Results: We have developed a new method using a machine learning approach to detect whether protein binding sites, once identified, interact with each other. The method exploits information relating to sequence and structural complementarity across protein interfaces and has been tested on a non-redundant data set consisting of 584 homo-dimers and 198 hetero-dimers extracted from the PDB. Results indicate that 87.4% of the interacting binding sites and 68.6% of the non-interacting binding sites were correctly identified. Furthermore, we built a pipeline that links this method to a modified version of our previously developed method that predicts the location of binding sites. Conclusion: We have demonstrated that this high-throughput pipeline is capable of identifying binding sites for proteins, their interacting binding sites and, ultimately, their binding partners on a large scale.
Database: OpenAIRE