Unsupervised Ground Metric Learning using Wasserstein Eigenvectors
Autor: | Huizing, Geert-Jan, Cantini, Laura, Peyré, Gabriel |
---|---|
Přispěvatelé: | Institut de biologie de l'ENS Paris (UMR 8197/1024) (IBENS), Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Département de Biologie - ENS Paris, École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Cantini, Laura, Institut de biologie de l'ENS Paris (IBENS), Département de Biologie - ENS Paris, École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Département de Mathématiques et Applications - ENS Paris (DMA), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Département de Biologie - ENS Paris |
Jazyk: | angličtina |
Rok vydání: | 2021 |
Předmět: | |
Zdroj: | Proceedings of the 39th International Conference on Machine Learning Proceedings of the 39th International Conference on Machine Learning, Jul 2022, Baltimore, United States |
Popis: | International audience; Optimal Transport (OT) defines geometrically meaningful "Wasserstein" distances, used in machine learning applications to compare probability distributions. However, a key bottleneck is the design of a "ground" cost which should be adapted to the task under study. In most cases, supervised metric learning is not accessible, and one usually resorts to some ad-hoc approach. Unsupervised metric learning is thus a fundamental problem to enable data-driven applications of Optimal Transport. In this paper, we propose for the first time a canonical answer by computing the ground cost as a positive eigenvector of the function mapping a cost to the pairwise OT distances between the inputs. This map is homogeneous and monotone, thus framing unsupervised metric learning as a non-linear Perron-Frobenius problem. We provide criteria to ensure the existence and uniqueness of this eigenvector. In addition, we introduce a scalable computational method using entropic regularization, which-in the large regularization limit-operates a principal component analysis dimensionality reduction. We showcase this method on synthetic examples and datasets. Finally, we apply it in the context of biology to the analysis of a high-throughput single-cell RNA sequencing (scRNAseq) dataset, to improve cell clustering and infer the relationships between genes in an unsupervised way. |
Databáze: | OpenAIRE |
Externí odkaz: |