AMaLa: Analysis of Directed Evolution Experiments via Annealed Mutational Approximated Landscape

Authors: Jorge Fernandez-de-Cossio-Diaz, Andrea Pagnani, Guido Uguzzoni, Luca Sesta
Contributors: Department of Applied Science and Technology [Politecnico di Torino] (DISAT), Politecnico di Torino = Polytechnic of Turin (Polito), Laboratoire de physique de l'ENS - ENS Paris (LPENS), Centre National de la Recherche Scientifique (CNRS)-Université de Paris (UP)-Sorbonne Université (SU)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Physique Statistique et Inférence pour la Biologie, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Université de Paris (UP)-Sorbonne Université (SU)-École normale supérieure - Paris (ENS Paris), Laboratoire de physique de l'ENS - ENS Paris (LPENS (UMR_8023)), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université de Paris (UP), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université de Paris (UP)-École normale supérieure - Paris (ENS Paris)
Language: English
Year of publication: 2021
Subject:
Directed Evolution
fitness landscape
Computer science
QH301-705.5
statistical modeling
Inference
Statistical weight
Catalysis
Article
Inorganic Chemistry
Evolution, Molecular
03 medical and health sciences
0302 clinical medicine
computational biology
[SDV.BBM]Life Sciences [q-bio]/Biochemistry
Molecular Biology
Physical and Theoretical Chemistry
Biology (General)
QD1-999
Spectroscopy
Selection (genetic algorithm)
030304 developmental biology
0303 health sciences
Models, Genetic
Organic Chemistry
High-Throughput Nucleotide Sequencing
Statistical model
General Medicine
Sequence Analysis, DNA
Deep Mutational Scanning
Computer Science Applications
Chemistry
Mutation (genetic algorithm)
Mutation
Sequence space (evolution)
Genetic Fitness
Directed Molecular Evolution
direct-coupling analysis
Algorithm
030217 neurology & neurosurgery
Algorithms
Source: International Journal of Molecular Sciences
International Journal of Molecular Sciences, MDPI, 2021, 22 (20), pp.10908. ⟨10.3390/ijms222010908⟩
International Journal of Molecular Sciences, Vol 22, Iss 20, p 10908 (2021)
Volume 22
Issue 20
ISSN: 1661-6596
1422-0067
DOI: 10.3390/ijms222010908
Description: We present Annealed Mutational approximated Landscape (AMaLa), a new method to infer fitness landscapes from the sequencing data of Directed Evolution experiments. Such experiments typically start from a single wild-type sequence, which undergoes Darwinian in vitro evolution through multiple rounds of mutation and selection for a target phenotype. In recent years, thanks to high-throughput screening and sequencing, Directed Evolution has emerged as a powerful instrument to probe fitness landscapes under controlled experimental conditions and as a relevant testing ground for developing accurate statistical models and inference algorithms. Existing approaches to fitness landscape modeling either use the enrichment of variant abundances as input, which requires observing the same variants in different rounds, or assume that the last sequenced round is sampled from an equilibrium distribution. AMaLa aims to leverage the information encoded in the whole time evolution. To do so, while assuming statistical sampling independence between sequenced rounds, it gauges the possible trajectories in sequence space with a time-dependent statistical weight consisting of two contributions: (i) an energy term accounting for the selection process and (ii) a generalized Jukes–Cantor model for the purely mutational step. This simple scheme accurately describes the Directed Evolution dynamics and infers a fitness landscape that correctly reproduces measurements of the phenotype under selection (e.g., antibiotic drug resistance), notably outperforming widely used inference strategies. In addition, we assess the reliability of AMaLa by showing how the inferred statistical model can be used to predict relevant structural properties of the wild-type sequence.
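Illustrative sketch (not taken from the paper): the two contributions to the statistical weight named in the description can be pictured with a toy Python example. Everything here is an assumption made for illustration: the function names, the site-independent (additive) energy, the parameters `beta_t` and `mu_t`, and the way the two terms are summed in log space. The record does not give the actual AMaLa parametrization or inference procedure, so this shows only one possible reading of "energy term plus generalized Jukes–Cantor term".

```python
import math


def jukes_cantor_probs(mu_t, q):
    """Per-site (p_same, p_diff) for a q-state Jukes-Cantor model.

    mu_t is the expected number of substitutions per site accumulated up to
    the round of interest (an assumed parametrization); it must be > 0.
    """
    if mu_t <= 0 or q < 2:
        raise ValueError("need mu_t > 0 and q >= 2")
    decay = math.exp(-q * mu_t / (q - 1))
    p_same = 1.0 / q + (q - 1) / q * decay  # site still equals the wild type
    p_diff = 1.0 / q - decay / q            # site holds one specific other symbol
    return p_same, p_diff


def mutation_log_prob(seq, wild_type, mu_t, q):
    """Log-probability of reaching seq from wild_type by mutation alone."""
    if len(seq) != len(wild_type):
        raise ValueError("sequences must have equal length")
    p_same, p_diff = jukes_cantor_probs(mu_t, q)
    distance = sum(a != b for a, b in zip(seq, wild_type))  # Hamming distance
    return (len(seq) - distance) * math.log(p_same) + distance * math.log(p_diff)


def additive_energy(seq, fields):
    """Toy energy: sum of per-site terms (hypothetical, site-independent)."""
    return sum(fields[i].get(symbol, 0.0) for i, symbol in enumerate(seq))


def log_weight(seq, wild_type, fields, beta_t, mu_t, q):
    """Unnormalized log-weight at one round: selection term + mutation term."""
    selection = -beta_t * additive_energy(seq, fields)
    return selection + mutation_log_prob(seq, wild_type, mu_t, q)


# Toy usage: two single mutants of "AC" at the same Hamming distance; the one
# with the lower (more favorable) energy receives the larger weight.
toy_fields = [{"G": -1.0, "T": 0.5}, {}]
w_good = log_weight("GC", "AC", toy_fields, beta_t=1.0, mu_t=0.3, q=4)
w_bad = log_weight("TC", "AC", toy_fields, beta_t=1.0, mu_t=0.3, q=4)
```

In this sketch the mutational term depends only on the Hamming distance from the wild type, so at a fixed distance the ranking of variants is set entirely by the energy term.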
Database: OpenAIRE