Accounting for errors in data improves timing in single-cell cancer evolution

Authors: Kylie Chen, Jiří C. Moravec, Alex Gavryushkin, David Welch, Alexei J. Drummond
Year of publication: 2021
Subject:
Description: Single-cell sequencing provides a new way to explore the evolutionary history of cancers. Compared to traditional bulk sequencing, which samples multiple heterogeneous cells, single-cell sequencing isolates and amplifies genetic material from a single cell. The ability to isolate a single cell makes it ideal for evolutionary inference. However, single-cell data is more error-prone due to the limited genomic material available per cell. Previous work using single-cell data to reconstruct the evolutionary history of cancers has not been integrated with standard evolutionary models. Here, we present error and mutation models for evolutionary inference of single-cell data within a mature and extensible Bayesian framework, BEAST2. Our framework enables integration with biologically informative models such as relaxed molecular clocks and population dynamic models. We reconstruct the phylogenetic history for a myeloproliferative cancer patient and two colorectal cancer patients. We find that the estimated times of terminal splitting events are shifted forward in time compared to models that ignore errors. Furthermore, we estimate that 50% to 70% of the evolutionary distance between samples can be explained by sequencing error. Our simulation studies show that ignoring errors leads to inaccurate estimates of divergence times, mutation parameters and population parameters. Our work opens the potential for integrative Bayesian models capable of combining multiple sources of data.
Author summary
Cell heterogeneity is one of the hallmarks of cancer evolution. It is this cell-to-cell genotype variation that makes the analysis of traditional bulk samples very challenging. Recent advances in single-cell sequencing technology allow cell evolution to be traced directly, but this technology is more error-prone than traditional methods. This motivates us to incorporate error models when inferring evolutionary history from genomic data.
We used single-cell data to infer and time the evolutionary history of cancers whilst accounting for sequencing errors. Conventional methods for studying cancer evolution either do not account for errors or are limited in modeling scope. We present a mutation model that accounts for errors in a mature framework, BEAST2, which has inbuilt molecular clock models. We show that errors can significantly impact the interpretation and timing of recent divergence events. Our results show that sequencing errors can explain as much as 50% to 70% of the genetic diversity within single-cell cancer samples.
Database: OpenAIRE