AlphaFold-predicted protein structures and small-angle X-ray scattering: insights from an extended examination of selected data in the Small-Angle Scattering Biological Data Bank.

Authors: Brookes E; Department of Chemistry and Biochemistry, University of Montana, 32 Campus Drive, Missoula, MT 59812, USA., Rocco M; Proteomica e Spettrometria di Massa, IRCCS Ospedale Policlinico San Martino, Largo R. Benzi 10, Genova 16132, Italy., Vachette P; Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette 91198, France., Trewhella J; School of Life and Environmental Sciences, The University of Sydney, NSW 2006, Australia.
Language: English
Source: Journal of applied crystallography [J Appl Crystallogr] 2023 Jul 20; Vol. 56 (Pt 4), pp. 910-926. Date of Electronic Publication: 2023 Jul 20 (Print Publication: 2023).
DOI: 10.1107/S1600576723005344
Abstract: By providing predicted protein structures from nearly all known protein sequences, the artificial intelligence program AlphaFold (AF) is having a major impact on structural biology. While a stunning accuracy has been achieved for many folding units, predicted unstructured regions and the arrangement of potentially flexible linkers connecting structured domains present challenges. Focusing on single-chain structures without prosthetic groups, an earlier comparison of features derived from small-angle X-ray scattering (SAXS) data taken from the Small-Angle Scattering Biological Data Bank (SASBDB) is extended to those calculated using the corresponding AF-predicted structures. Selected SASBDB entries were carefully examined to ensure that they represented data from monodisperse protein solutions and had sufficient statistical precision and q resolution for reliable structural evaluation. Three examples were identified where there is clear evidence that the single AF-predicted structure cannot account for the experimental SAXS data. Instead, excellent agreement is found with ensemble models generated by allowing for flexible linkers between high-confidence predicted structured domains. A pool of representative structures was generated using a Monte Carlo method that adjusts backbone dihedral allowed angles along potentially flexible regions. A fast ensemble modelling method was employed that optimizes the fit of pair distance distribution functions [P(r) versus r] and intensity profiles [I(q) versus q] computed from the pool to their experimental counterparts. These results highlight the complementarity between AF prediction, solution SAXS and molecular dynamics/conformational sampling for structural modelling of proteins having both structured and flexible regions.
(© Emre Brookes et al. 2023.)
Database: MEDLINE
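
Illustrative note on the ensemble fitting step: the abstract describes optimizing the fit of profiles computed from a pool of conformers to the experimental P(r) and I(q), but does not specify the algorithm. The sketch below shows one common way such a fit can be posed, as a non-negative least-squares problem over conformer weights, solved here by simple cyclic coordinate descent. This is an assumption for illustration and is not claimed to be the authors' method. The Guinier-type curves are toy stand-ins for profiles that would be computed from real structures, and all function names, Rg values and weights are hypothetical.

```python
import math


def guinier_profile(q_values, rg):
    """Toy curve I(q)/I(0) = exp(-(q*Rg)^2 / 3), a stand-in for a computed profile."""
    return [math.exp(-((q * rg) ** 2) / 3.0) for q in q_values]


def chi_square(weights, pool, target, sigma):
    """Sum of squared, error-normalised residuals of the weighted pool vs. target."""
    total = 0.0
    for i, (t, s) in enumerate(zip(target, sigma)):
        model = sum(w * profile[i] for w, profile in zip(weights, pool))
        total += ((model - t) / s) ** 2
    return total


def fit_ensemble_weights(pool, target, sigma, sweeps=20000):
    """Non-negative least squares by cyclic coordinate descent (assumed solver)."""
    n, m = len(pool), len(target)
    # Normal-equation terms with 1/sigma^2 weighting.
    gram = [[sum(pool[j][i] * pool[k][i] / sigma[i] ** 2 for i in range(m))
             for k in range(n)] for j in range(n)]
    rhs = [sum(pool[j][i] * target[i] / sigma[i] ** 2 for i in range(m))
           for j in range(n)]
    weights = [1.0 / n] * n  # start from a uniform ensemble
    for _ in range(sweeps):
        for j in range(n):
            # Exact minimiser along coordinate j, clipped so weights stay >= 0.
            rest = sum(gram[j][k] * weights[k] for k in range(n) if k != j)
            weights[j] = max(0.0, (rhs[j] - rest) / gram[j][j])
    return weights


# Hypothetical pool: three conformers with different radii of gyration (in Å).
q_values = [0.01 + 0.005 * i for i in range(59)]  # q from 0.01 to 0.30 Å^-1
pool = [guinier_profile(q_values, rg) for rg in (15.0, 25.0, 40.0)]

# Synthetic, noise-free "experimental" curve built from known weights.
true_weights = [0.7, 0.0, 0.3]
target = [sum(w * p[i] for w, p in zip(true_weights, pool))
          for i in range(len(q_values))]
sigma = [0.01] * len(q_values)  # assumed constant uncertainty

weights = fit_ensemble_weights(pool, target, sigma)
print([round(w, 3) for w in weights])
```

The same function could be applied to P(r) curves, or to I(q) and P(r) rows stacked into one target vector, since it only sees lists of numbers. In practice a library routine such as `scipy.optimize.nnls` could replace the hand-written loop.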