A multi-scale coevolutionary approach to predict interactions between protein domains.

Authors: Croce G; Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France., Gueudré T; Italian Institute for Genomic Medicine, Torino, Italy., Ruiz Cuevas MV; Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France., Keidel V; Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona CA, United States of America., Figliuzzi M; Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France., Szurmant H; Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona CA, United States of America., Weigt M; Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France.
Language: English
Source: PLoS computational biology [PLoS Comput Biol] 2019 Oct 21; Vol. 15 (10), pp. e1006891. Date of Electronic Publication: 2019 Oct 21 (Print Publication: 2019).
DOI: 10.1371/journal.pcbi.1006891
Abstract: Interacting proteins and protein domains coevolve on multiple scales, from their correlated presence across species, to correlations in amino-acid usage. Genomic databases provide rapidly growing data for variability in genomic protein content and in protein sequences, calling for computational predictions of unknown interactions. We first introduce the concept of direct phyletic couplings, based on global statistical models of phylogenetic profiles. They strongly increase the accuracy of predicting pairs of related protein domains beyond simpler correlation-based approaches like phylogenetic profiling (80% vs. 30-50% positives out of the 1000 highest-scoring pairs). Combined with the direct coupling analysis of inter-protein residue-residue coevolution, we provide multi-scale evidence for direct but unknown interaction between protein families. An in-depth discussion shows these to be biologically sensible and directly experimentally testable. Negative phyletic couplings highlight alternative solutions for the same functionality, including documented cases of convergent evolution. Thereby our work proves the strong potential of global statistical modeling approaches to genome-wide coevolutionary analysis, far beyond the established use for individual protein complexes and domain-domain interactions.
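The contrast drawn in the abstract between correlation-based phylogenetic profiling and direct couplings from a global model can be illustrated with a small sketch. The code below is not the authors' method: it assumes a binary genomes-by-domains presence/absence matrix and swaps in a simple mean-field approximation (the negative inverse of a ridge-regularized covariance matrix) for whatever inference procedure the paper uses. The function names, the `ridge` parameter, and the toy data are all hypothetical. The toy data form a chain A-B-C, so A and C co-occur only indirectly through B.

```python
import numpy as np


def correlation_scores(profiles):
    """Pearson correlation between domain columns of a genomes x domains matrix."""
    return np.corrcoef(profiles, rowvar=False)


def mean_field_couplings(profiles, ridge=0.01):
    """Mean-field estimate of direct couplings (an assumption, not the paper's method).

    Couplings are taken as the negative inverse of the covariance matrix,
    with a small ridge term (hypothetical parameter) for numerical stability.
    """
    cov = np.cov(profiles, rowvar=False)
    cov += ridge * np.eye(cov.shape[0])
    couplings = -np.linalg.inv(cov)
    np.fill_diagonal(couplings, 0.0)  # self-couplings are not of interest here
    return couplings


# Toy presence/absence data: B copies A in 90% of genomes, C copies B in 90%.
rng = np.random.default_rng(0)
n_genomes = 2000
a = rng.random(n_genomes) < 0.5
b = np.where(rng.random(n_genomes) < 0.9, a, ~a)
c = np.where(rng.random(n_genomes) < 0.9, b, ~b)
profiles = np.column_stack([a, b, c]).astype(float)

corr = correlation_scores(profiles)
J = mean_field_couplings(profiles)

print("correlation A-B, A-C:", corr[0, 1], corr[0, 2])
print("coupling    A-B, A-C:", J[0, 1], J[0, 2])
```

In this toy setting the A-C correlation stays high even though the two domains are linked only through B, whereas the A-C coupling is much weaker than the A-B coupling. That is the kind of disentangling of direct from indirect co-occurrence that a global model is meant to provide.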
Competing Interests: The authors have declared that no competing interests exist.
Database: MEDLINE