Large-scale design and refinement of stable proteins using sequence-only models

Authors: Devin Strickland, Alex Ford, Tamuka M. Chidyausiku, Francis C. Motta, Hamed Eramian, Gabriel J. Rocklin, Jedediah M. Singer, Longxing Cao, Ethan Ho, Anindya Roy, Gevorg Grigoryan, Scott Novotney, Asim K. Bera, Eva-Maria Strauch, Nicholas Leiby, Cameron M. Chow, Hugh K. Haddox, David Baker, Lance Stewart, Matthew W. Vaughn, Craig O. Mackenzie, Frank DiMaio, Eric Klavins
Year of publication: 2021
Subject:
DOI: 10.1101/2021.03.12.435185
Description: Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we report a neural network model that predicts protein stability based only on sequences of amino acids, and demonstrate its performance by evaluating the stability of almost 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We also report a second neural network model that is able to generate novel stable proteins. Finally, we show that the predictive model can be used to substantially increase the stability of both expert-designed and model-generated proteins.
Database: OpenAIRE