Uncovering Non-random Binary Patterns Within Sequences of Intrinsically Disordered Proteins.

Authors: Cohan MC; Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA., Shinn MK; Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA., Lalmansingh JM; Department of Physics, Washington University in St. Louis, MO 63130, USA., Pappu RV; Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA. Electronic address: pappu@wustl.edu.
Language: English
Source: Journal of molecular biology [J Mol Biol] 2022 Jan 30; Vol. 434 (2), pp. 167373. Date of Electronic Publication: 2021 Dec 01.
DOI: 10.1016/j.jmb.2021.167373
Abstract: Sequence-ensemble relationships of intrinsically disordered proteins (IDPs) are governed by binary patterns such as the linear clustering or mixing of specific residues or residue types with respect to one another. To enable the discovery of potentially important, shared patterns across sequence families, we describe a computational method referred to as NARDINI for Non-random Arrangement of Residues in Disordered Regions Inferred using Numerical Intermixing. This work was partially motivated by the observation that parameters that are currently in use for describing different binary patterns are not interoperable across IDPs of different amino acid compositions and lengths. In NARDINI, we generate an ensemble of scrambled sequences to set up a composition-specific null model for the patterning parameters of interest. We then compute a series of pattern-specific z-scores to quantify how each pattern deviates from the null model for the IDP of interest. The z-scores help in identifying putative non-random linear sequence patterns within an IDP. We demonstrate the use of NARDINI-derived z-scores by identifying sequence patterns in three well-studied IDP systems. We also demonstrate how NARDINI can be deployed to study archetypal IDPs across homologs and orthologs. Overall, NARDINI is likely to aid in designing novel IDPs with a view toward engineering new sequence-function relationships or uncovering cryptic ones. We further propose that the z-scores introduced here are likely to be useful for theoretical and computational descriptions of sequence-ensemble relationships across IDPs of different compositions and lengths.
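Illustrative note: the scramble-based null model and z-score described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions and not the NARDINI implementation. The function names, the window size, the number of scrambles, and the toy patterning parameter `local_asymmetry` are all hypothetical choices made for illustration; the abstract does not specify the patterning parameters actually used.

```python
import random
import statistics


def local_asymmetry(seq, group_a, group_b, window=5):
    """Toy patterning parameter (an assumption, not taken from the paper).

    For each sliding window, compare the fraction of residues in group_a
    with the fraction in group_b, then average over all windows. Higher
    values mean the two residue types are more segregated along the chain.
    """
    scores = []
    for i in range(len(seq) - window + 1):
        blob = seq[i:i + window]
        f_a = sum(r in group_a for r in blob) / window
        f_b = sum(r in group_b for r in blob) / window
        total = f_a + f_b
        # Windows containing neither residue type contribute zero.
        scores.append((f_a - f_b) ** 2 / total if total > 0 else 0.0)
    return statistics.fmean(scores)


def pattern_z_score(seq, group_a, group_b, n_scrambles=1000, seed=0):
    """Z-score of the observed parameter against a composition-matched null.

    The null ensemble is built by shuffling the residues of `seq`, so every
    scramble has the same length and amino acid composition as the input.
    """
    rng = random.Random(seed)
    observed = local_asymmetry(seq, group_a, group_b)
    residues = list(seq)
    null_values = []
    for _ in range(n_scrambles):
        rng.shuffle(residues)
        null_values.append(local_asymmetry("".join(residues), group_a, group_b))
    mean_null = statistics.fmean(null_values)
    sd_null = statistics.stdev(null_values)
    if sd_null == 0:
        # Degenerate null (for example, a homopolymer): no deviation is defined.
        return 0.0
    return (observed - mean_null) / sd_null


# Hypothetical example sequences with identical composition.
segregated = "KKKKKKKKKKEEEEEEEEEE"
well_mixed = "KEKEKEKEKEKEKEKEKEKE"
z_segregated = pattern_z_score(segregated, "KR", "DE")
z_mixed = pattern_z_score(well_mixed, "KR", "DE")
```

Because both example sequences share one composition and length, they are compared against the same kind of null ensemble; under this toy parameter a positive z-score indicates more segregation than expected by chance and a negative z-score indicates more mixing.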
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2021 The Author(s). Published by Elsevier Ltd. All rights reserved.)
Database: MEDLINE