Inferring a complete genotype-phenotype map from a small number of measured phenotypes
Authors: | Alex Joule, Alice Patterson-Robert, Zachary R. Sailer, Michael J. Harms, Rowena E. Martin, Sarah H. Shafik, Robert L. Summers |
---|---|
Language: | English |
Publication year: | 2020 |
Subject: | 0301 basic medicine; Heredity; Xenopus; Protozoan Proteins; Mathematical and Statistical Techniques; 0302 clinical medicine; Genotype; Biology (General); Protozoans; Ecology; Small number; Statistics; Malarial Parasites; Uncertainty; Eukaryota; Animal Models; Phenotype; Experimental Organism Systems; Fitness Epistasis; Computational Theory and Mathematics; Modeling and Simulation; Xenopus Oocytes; Vertebrates; Physical Sciences; Mutation (genetic algorithm); Frogs; Combinatorial map; Research Article; QH301-705.5; Plasmodium falciparum; Computational biology; Biology; Research and Analysis Methods; Models, Biological; Amphibians; 03 medical and health sciences; Cellular and Molecular Neuroscience; Model Organisms; Genetics; Animals; Point Mutation; Parasite Evolution; Statistical Methods; Molecular Biology; Ecology, Evolution, Behavior and Systematics; Organisms; Biology and Life Sciences; Statistical model; Parasitic Protozoans; 030104 developmental biology; Mutation; Animal Studies; Epistasis; Parasitology; Scale (map); Zoology; Mathematics; 030217 neurology & neurosurgery; Forecasting |
Source: | PLoS Computational Biology, Vol 16, Iss 9, p e1008243 (2020) |
ISSN: | 1553-7358 |
Description: | Understanding evolution requires detailed knowledge of genotype-phenotype maps; however, it can be a herculean task to measure every phenotype in a combinatorial map. We have developed a computational strategy to predict the missing phenotypes from an incomplete, combinatorial genotype-phenotype map. As a test case, we used an incomplete genotype-phenotype dataset previously generated for the malaria parasite’s ‘chloroquine resistance transporter’ (PfCRT). Wild-type PfCRT (PfCRT^3D7) lacks significant chloroquine (CQ) transport activity, but the introduction of the eight mutations present in the ‘Dd2’ isoform of PfCRT (PfCRT^Dd2) enables the protein to transport CQ away from its site of antimalarial action. This gain of a transport function imparts CQ resistance to the parasite. A combinatorial map between PfCRT^3D7 and PfCRT^Dd2 consists of 256 genotypes, of which only 52 have had their CQ transport activities measured through expression in the Xenopus laevis oocyte. We trained a statistical model with these 52 measurements to infer the CQ transport activity for the remaining 204 combinatorial genotypes between PfCRT^3D7 and PfCRT^Dd2. Our best-performing model incorporated a binary classifier, a nonlinear scale, and additive effects for each mutation. The addition of specific pairwise- and high-order-epistatic coefficients decreased the predictive power of the model. We evaluated our predictions by experimentally measuring the CQ transport activities of 24 additional PfCRT genotypes. The R² value between our predicted and newly measured phenotypes was 0.90. We then used the model to probe the accessibility of evolutionary trajectories through the map. Approximately 1% of the possible trajectories between PfCRT^3D7 and PfCRT^Dd2 are accessible; however, none of the trajectories entailed eight successive increases in CQ transport activity. These results demonstrate that phenotypes can be inferred with known uncertainty from a partial genotype-phenotype dataset. We also validated our approach against a collection of previously published genotype-phenotype maps. The model therefore appears general and should be applicable to a large number of genotype-phenotype maps.
Author summary: Biological macromolecules are built from chains of building blocks. The function of a macromolecule depends on the specific chemical properties of the building blocks that make it up. Macromolecules evolve through mutations that swap one building block for another. Understanding how biomolecules work and evolve therefore requires knowledge of the effects of mutations. The effects of mutations can be measured experimentally; however, because there are a vast number of possible combinations of mutations, it is often difficult to make enough measurements to understand biomolecular function and evolution. In this paper, we describe a simple method to predict the effects of mutations on biomolecules from a small number of measurements. This method works by appropriately averaging the effects of mutations seen in different contexts. We test the method by predicting the effects of mutations on PfCRT, a macromolecule from the malarial parasite that confers drug resistance. We find that our method is fast and effective. Using a small number of measurements, we were able to gain insight into the evolutionary steps by which this macromolecule conferred drug resistance. To make this method accessible to other researchers, we have released it as an open-source software package: https://gpseer.readthedocs.io. |
Database: | OpenAIRE |
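
The counts quoted in the description can be reproduced with a short calculation. The Python sketch below enumerates the 2^8 = 256 combinatorial genotypes spanned by eight mutations, counts the 8! = 40,320 orders in which those mutations can accumulate one at a time, and counts the orders along which the phenotype never decreases. Treating a trajectory as one such order, the "never decreases" accessibility rule, the `tolerance` parameter, the function names, and both toy phenotype tables are assumptions made for illustration. None of them is taken from the paper or from the gpseer package, and no PfCRT measurements are used.

```python
from itertools import permutations, product
from math import factorial

N_SITES = 8        # eight mutations separate the two PfCRT isoforms
N_MEASURED = 52    # genotypes with measured activity, per the description


def all_genotypes(n_sites=N_SITES):
    """Return every combination of wild-type (0) and mutant (1) states."""
    return list(product((0, 1), repeat=n_sites))


def count_accessible(phenotype, n_sites=N_SITES, tolerance=0.0):
    """Count mutation orders in which no single step lowers the phenotype
    by more than `tolerance` (an assumed accessibility rule)."""
    accessible = 0
    for order in permutations(range(n_sites)):
        genotype = [0] * n_sites
        current = phenotype[tuple(genotype)]
        for site in order:
            genotype[site] = 1
            following = phenotype[tuple(genotype)]
            if following < current - tolerance:
                break  # this step loses activity: the order is not accessible
            current = following
        else:
            accessible += 1  # all steps passed the check
    return accessible


genotypes = all_genotypes()
n_unmeasured = len(genotypes) - N_MEASURED
n_trajectories = factorial(N_SITES)

# Toy map 1 (hypothetical): purely additive, each mutation adds one unit.
toy_additive = {g: float(sum(g)) for g in genotypes}

# Toy map 2 (hypothetical): mutation 0 is harmful unless mutation 1 is present.
toy_epistatic = {
    g: float(sum(g)) - (2.0 if g[0] == 1 and g[1] == 0 else 0.0)
    for g in genotypes
}

if __name__ == "__main__":
    print("genotypes:", len(genotypes))
    print("unmeasured:", n_unmeasured)
    print("trajectories:", n_trajectories)
    print("accessible (additive toy):", count_accessible(toy_additive))
    print("accessible (epistatic toy):", count_accessible(toy_epistatic))
```

In the additive toy map every order is accessible, whereas a single sign-epistatic interaction in the second toy map already excludes every order in which mutation 0 precedes mutation 1. This illustrates why accessibility has to be evaluated on a complete map, measured or inferred, rather than on isolated mutational effects.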