optimalFlow: optimal transport approach to flow cytometry gating and population matching

Autor: Eustasio del Barrio, Hristo Inouzhe, Jean-Michel Loubes, Carlos Matrán, Agustín Mayo-Íscar
Jazyk: angličtina
Rok vydání: 2020
Předmět:
Zdroj: BMC Bioinformatics, Vol 21, Iss 1, Pp 1-25 (2020)
Druh dokumentu: article
ISSN: 1471-2105
DOI: 10.1186/s12859-020-03795-w
Popis: Abstract Background Data obtained from flow cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals, with different characteristics such as illness, age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments and the different types of flow cytometers are some of the technical causes of variability. This mixture of sources of variability makes the use of supervised machine learning for identification of cell populations difficult. The present work is conceived as a combination of strategies to facilitate the task of supervised gating. Results We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning, restricted to the new groups, performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state of the art techniques in the proposed datasets. Our code is freely available as optimalFlow, a Bioconductor R package at https://bioconductor.org/packages/optimalFlow . Conclusions optimalFlowTemplates + optimalFlowClassification addresses the problem of using supervised learning while accounting for biological and technical variability. Our methodology provides a robust automated gating workflow that handles the intrinsic variability of flow cytometry data well. Our main innovation is the methodology itself and the optimal transport techniques that we apply to flow cytometry analysis.
Databáze: Directory of Open Access Journals