Statistical analysis of a hierarchical clustering algorithm with outliers
Author: | Nicolas Klutchnikoff, Audrey Poterie, Laurent Rouvière |
---|---|
Contributors: | Université de Bretagne Sud (UBS), Institut de Recherche Mathématique de Rennes (IRMAR), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-École normale supérieure - Rennes (ENS Rennes)-Université de Rennes 2 (UR2)-Centre National de la Recherche Scientifique (CNRS)-Institut Agro Rennes Angers, Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro) |
Publication year: | 2022 |
Subject: | |
Source: | Journal of Multivariate Analysis, 2022, 192, article no. 105075. ⟨10.1016/j.jmva.2022.105075⟩ |
ISSN: | 0047-259X 1095-7243 |
DOI: | 10.48550/arxiv.2203.09781 |
Description: | International audience; It is well known that the classical single linkage algorithm usually fails to identify clusters in the presence of outliers. In this paper, we propose a new version of this algorithm and study its mathematical performance. In particular, we establish an oracle-type inequality which ensures that our procedure recovers the clusters with high probability under minimal assumptions on the distribution of the outliers. From this inequality we deduce the consistency of our algorithm and some rates of convergence in various situations. The performance of our approach is also assessed through simulation studies, and a comparison with classical clustering algorithms on simulated data is presented. |
Database: | OpenAIRE |
External link: |