Geodesic Sinkhorn: optimal transport for high-dimensional datasets

Authors: Maria Ramos Zapatero, Guillaume Huguet, Alexander Tong, Guy Wolf, Smita Krishnaswamy
Year of publication: 2022
Subject:
Source: Maria Ramos Zapatero
DOI: 10.48550/arxiv.2211.00805
Description: Understanding the dynamics and reactions of cells from population snapshots is a major challenge in single-cell transcriptomics. Here, we present Geodesic Sinkhorn, a method for interpolating populations along a data manifold that leverages existing kernels developed for single-cell dimensionality reduction and visualization methods. Our Geodesic Sinkhorn method uses a heat-geodesic ground distance that, as compared to Euclidean ground distances, is more accurate for interpolating single-cell dynamics on a wide variety of datasets and significantly speeds up the computation for sparse kernels. We first apply Geodesic Sinkhorn to 10 single-cell transcriptomics time series interpolation datasets as a drop-in replacement for existing interpolation methods where it outperforms on all datasets, showing its effectiveness in modeling cell dynamics. Second, we show how to efficiently approximate the operator with polynomial kernels allowing us to improve scaling to large datasets. Finally, we define the conditional Wasserstein-average treatment effect and show how it can elucidate the treatment effect on single-cell populations on a drug screen.
Comment: 15 pages, 5 tables, 5 figures, submitted to RECOMB 2023
Database: OpenAIRE