Role of simple descriptors and applicability domain in predicting change in protein thermostability.

Authors: McGuinness KN; Modeling and Informatics, Merck & Co., Inc., Kenilworth, New Jersey, United States of America., Pan W; Biochemical Engineering and Structure, Merck & Co., Inc., Rahway, New Jersey, United States of America., Sheridan RP; Modeling and Informatics, Merck & Co., Inc., Kenilworth, New Jersey, United States of America., Murphy G; Biochemical Engineering and Structure, Merck & Co., Inc., Rahway, New Jersey, United States of America., Crespo A; Modeling and Informatics, Merck & Co., Inc., Kenilworth, New Jersey, United States of America.
Language: English
Source: PloS one [PLoS One] 2018 Sep 07; Vol. 13 (9), pp. e0203819. Date of Electronic Publication: 2018 Sep 07 (Print Publication: 2018).
DOI: 10.1371/journal.pone.0203819
Abstract: The melting temperature (Tm) of a protein is the temperature at which half of the protein population is in a folded state. Therefore, Tm is a measure of the thermostability of a protein. Increasing the Tm of a protein is a critical goal in biotechnology and biomedicine. However, predicting the change in melting temperature (dTm) due to mutations at a single residue is difficult because it depends on an intricate balance of forces. Existing methods for predicting dTm have had similar levels of success using generally complex models. We find that a machine learning model trained on a simple set of easy-to-calculate physicochemical descriptors of the local environment of the mutation performs as well as more complicated machine learning models and is 2-6 orders of magnitude faster. Importantly, unlike most previous publications, we perform a blind prospective test of our simple model by designing 96 variants of a protein not in the training set. Results from retrospective and prospective predictions reveal the limited applicability domain of each model. This study highlights the current deficiencies in the available dTm dataset and is a call to the community to systematically design a larger and more diverse experimental dataset of mutants so that dTm can be predicted prospectively with greater certainty.
Competing Interests: All authors were employed by Merck & Co., Inc. while conducting the research presented and preparing the article for publication. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.
Database: MEDLINE