Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition

Authors: Marcel Schmidt am Busch, Audrey Sedano, Thomas Simonson
Contributors: Laboratoire de Biochimie de l'Ecole polytechnique (BIOC), École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)
Language: English
Year of publication: 2010
Subject:
Models, Molecular
MESH: Amino Acids
MESH: Sequence Analysis, Protein
Entropy
Biophysics/Protein Folding
MESH: Protein Structure, Secondary
Computational Biology/Macromolecular Structure Analysis
PDZ Domains
Protein Structure, Secondary
Protein sequencing
Protein structure
Computational Biology/Protein Homology Detection
Sequence Analysis, Protein
MESH: PDZ Domains
MESH: Proteins
Amino Acids
Hidden Markov model
Databases, Protein
MESH: Structural Homology, Protein
Genetics
Multidisciplinary
Protein Stability
MESH: Entropy
MESH: Reproducibility of Results
Medicine
MESH: Models, Molecular
MESH: Computational Biology
Research Article
MESH: Databases, Protein
Biophysics/Theory and Simulation
MESH: Mutation
Science
Protein domain
Protein design
PDZ domain
Sequence alignment
Computational biology
Biology
MESH: Protein Stability
Position-Specific Scoring Matrices
[SDV.BBM]Life Sciences [q-bio]/Biochemistry, Molecular Biology
Biophysics/Structural Genomics
Computational Biology
Proteins
Reproducibility of Results
MESH: Position-Specific Scoring Matrices
Structural Homology, Protein
Mutation
Source: PLoS ONE, Vol 5, Iss 5, p e10410 (2010)
PLoS ONE
PLoS ONE, Public Library of Science, 2010, 5 (5), pp.e10410. ⟨10.1371/journal.pone.0010410⟩
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0010410
Description: International audience; BACKGROUND: Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases. METHODOLOGY/PRINCIPAL FINDINGS: We explore this strategy for four SCOP families: Small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately distant natural homologues of the initial templates; e.g., the SUPERFAMILY profile Hidden Markov Model library recognizes 85% of the low-energy sequences as native-like. Conversely, Position-Specific Scoring Matrices derived from the designed sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed and ways to improve the method are discussed. CONCLUSIONS/SIGNIFICANCE: For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.
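Illustrative note: the abstract mentions deriving Position-Specific Scoring Matrices from designed sequences and using them to score candidate homologues. The record does not say how the authors built or applied their matrices, so the following is only a minimal sketch of the general idea, not their procedure. It assumes equal-length, gap-free sequences over the 20 standard amino acids, a uniform background distribution, and a flat pseudocount; the function names `build_pssm` and `score` and the four toy sequences are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sequences, pseudocount=1.0, background=None):
    """Build a log-odds (base 2) PSSM from equal-length, gap-free sequences.

    Assumption: every character is one of the 20 standard amino acids.
    Assumption: the background defaults to a uniform distribution.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

    # Denominator for smoothed frequencies: observed counts plus pseudocounts.
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background[aa])
        pssm.append(column)
    return pssm


def score(pssm, sequence):
    """Sum the per-position log-odds scores of an ungapped sequence."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence length must match the PSSM length")
    return sum(column[aa] for column, aa in zip(pssm, sequence))


# Hypothetical toy "designed" sequences, invented for illustration only.
designed = ["ACDK", "ACEK", "GCDK", "ACDR"]
pssm = build_pssm(designed)
similar = score(pssm, "ACDK")    # resembles the designed set
unrelated = score(pssm, "WWWW")  # shares no residue with the designed set
```

In this sketch a sequence that resembles the designed set scores higher than an unrelated one. A real database search would also need gap handling, sequence weighting, and a statistical significance estimate, none of which are shown here.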
Database: OpenAIRE