Functional data clustering via hypothesis testing k-means
Author: | Adriano Zanin Zambom, Julian A. A. Collazos, Ronaldo Dias |
---|---|
Year of publication: | 2018 |
Subject: |
Statistics and Probability
Computer science; 05 social sciences; k-means clustering; Pattern recognition; 01 natural sciences; Measure (mathematics); Partition (database); 010104 statistics & probability; Computational Mathematics; 0502 economics and business; Cluster (physics); Artificial intelligence; 0101 mathematics; Statistics, Probability and Uncertainty; Unsupervised clustering; Cluster analysis; Smoothing; 050205 econometrics; Statistical hypothesis testing |
Source: | Computational Statistics. 34:527-549 |
ISSN: | 1613-9658 0943-4062 |
Description: | Functional data clustering procedures seek to identify subsets of curves with similar shapes and to estimate a representative mean curve for each subset. In this work, we propose a new approach to functional data clustering that combines a hypothesis test of parallelism with a test for equality of means. These tests use all observations, which come from an underlying functional model, to compute a measure that determines the smoothed cluster center to which each subject’s data belong. This measure is incorporated into a modified k-means algorithm that partitions subjects into clusters and finds the cluster centers. While competing algorithms require a fixed amount of smoothing for all curves, the proposed test-based procedure performs unsupervised clustering of curves with different degrees of smoothing. Extensive numerical experiments were conducted, and the results on simulated and real datasets suggest that the proposed algorithm outperforms other clustering approaches in most cases. |
Database: | OpenAIRE |
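
The description outlines a k-means loop in which a test-based measure decides which cluster center each curve belongs to. The following is only a rough sketch of that loop under strong assumptions, not the authors' procedure: the record gives no test statistics, so the hypothetical `dissimilarity` function substitutes a simple stand-in with a shape term (zero when a curve is parallel to the center) and a level term (zero when the means agree). The names `dissimilarity` and `cluster_curves`, the common sampling grid, and the unsmoothed pointwise-mean centers are all illustrative choices.

```python
import random
from statistics import fmean, pvariance


def dissimilarity(curve, center):
    """Stand-in measure (an assumption, NOT the paper's test statistics).

    Shape term: variance of the pointwise difference, which is zero when
    the curve is parallel to the center.
    Level term: squared mean difference, which is zero when the means agree.
    """
    diff = [y - c for y, c in zip(curve, center)]
    return pvariance(diff) + fmean(diff) ** 2


def cluster_curves(curves, k, n_iter=50, seed=0):
    """k-means-style loop over curves sampled on a common grid."""
    rng = random.Random(seed)
    # Start from k randomly chosen curves as initial centers.
    centers = [list(c) for c in rng.sample(curves, k)]
    labels = [-1] * len(curves)
    for _ in range(n_iter):
        # Assignment step: each curve goes to the center with the
        # smallest dissimilarity.
        new_labels = [
            min(range(k), key=lambda j: dissimilarity(curve, centers[j]))
            for curve in curves
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: pointwise mean of the member curves (no smoothing
        # here, unlike the smoothed centers mentioned in the description).
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [fmean(col) for col in zip(*members)]
    return labels, centers
```

With equal weights, the two terms add up to the mean squared pointwise difference, so this sketch behaves like ordinary k-means on discretized curves. A test-based version would replace `dissimilarity` with a quantity derived from the parallelism and equality-of-means tests.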