Accurate prediction of in vivo protein abundances by coupling constraint-based modelling and machine learning.

Authors: Moura Ferreira MA; Department of Microbiology, Federal University of Viçosa, Viçosa, Minas Gerais, 36570900, Brazil., Wendering P; Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, 14476, Germany; Systems Biology and Mathematical Modelling, Max Planck Institute of Molecular Plant Physiology, Potsdam, 14476, Germany., Arend M; Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, 14476, Germany; Systems Biology and Mathematical Modelling, Max Planck Institute of Molecular Plant Physiology, Potsdam, 14476, Germany., Batista da Silveira W; Department of Microbiology, Federal University of Viçosa, Viçosa, Minas Gerais, 36570900, Brazil., Nikoloski Z; Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, 14476, Germany; Systems Biology and Mathematical Modelling, Max Planck Institute of Molecular Plant Physiology, Potsdam, 14476, Germany. Electronic address: zoran.nikoloski@uni-potsdam.de.
Language: English
Source: Metabolic engineering [Metab Eng] 2023 Nov; Vol. 80, pp. 184-192. Date of Electronic Publication: 2023 Oct 05.
DOI: 10.1016/j.ymben.2023.09.014
Abstract: Quantification of how different environmental cues affect protein allocation can provide important insights for understanding cell physiology. While absolute quantification of proteins can be obtained by resource-intensive mass-spectrometry-based technologies, prediction of protein abundances offers another way to obtain insights into protein allocation. Here we present CAMEL, a framework that couples constraint-based modelling with machine learning to predict protein abundance for any environmental condition. This is achieved by building machine learning models that leverage static features, derived from protein sequences, and condition-dependent features predicted from protein-constrained metabolic models. Our findings demonstrate that CAMEL results in excellent prediction of protein allocation in E. coli (average Pearson correlation of at least 0.9), and moderate performance in S. cerevisiae (average Pearson correlation of at least 0.5). Therefore, CAMEL outperformed contending approaches without using molecular read-outs from unseen conditions and provides a valuable tool for using protein allocation in biotechnological applications.
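Illustrative note: the abstract reports performance as an average Pearson correlation between predicted and measured abundances. The sketch below shows one way such a score could be computed and how static and condition-dependent features might be joined into a single model input. It is not the CAMEL implementation: the function names are hypothetical, and averaging over conditions is an assumption, since the abstract does not state what the average is taken over.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / norm


def average_pearson(predicted, measured):
    """Mean of per-condition Pearson correlations.

    Both arguments map a condition name to a list of protein abundances,
    listed in the same protein order. Averaging over conditions is an
    assumption made for this sketch.
    """
    scores = [pearson(predicted[c], measured[c]) for c in measured]
    return sum(scores) / len(scores)


def build_feature_row(static_features, condition_features):
    """Join sequence-derived (static) and condition-dependent features
    for one protein in one condition into a single input vector
    (hypothetical helper)."""
    return list(static_features) + list(condition_features)
```

With this layout, a score close to 1 for a condition means predicted abundances rank and scale almost linearly with the measured ones for that condition.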
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2023 The Authors. Published by Elsevier Inc. All rights reserved.)
Database: MEDLINE