Constrained distance based clustering for time-series: a comparative and experimental study.

Authors: Lampert, Thomas; Dao, Thi-Bich-Hanh; Lafabregue, Baptiste; Serrette, Nicolas; Forestier, Germain; Crémilleux, Bruno; Vrain, Christel; Gançarski, Pierre
Source: Data Mining & Knowledge Discovery; Nov 2018, Vol. 32, Issue 6, p1663-1707, 45p
Abstract: Constrained clustering is becoming an increasingly popular approach in data mining. It offers a balance between the complexity of producing a formal definition of thematic classes (required by supervised methods) and unsupervised approaches, which ignore expert knowledge and intuition. Nevertheless, the application of constrained clustering to time-series analysis is relatively unknown. This is partly due to the unsuitability of the Euclidean distance metric, which is typically used in data mining, for time-series data. This article addresses this divide by presenting an exhaustive review of constrained clustering algorithms and by modifying publicly available implementations to use a more appropriate distance measure, dynamic time warping. It presents a comparative study in which their performance is evaluated when applied to time-series. It is found that k-means based algorithms become computationally expensive and unstable under these modifications. Spectral approaches are easily applied and offer state-of-the-art performance, whereas declarative approaches are also easily applied and guarantee constraint satisfaction. An analysis of the results identifies several factors that influence an algorithm’s performance when constraints are introduced. [ABSTRACT FROM AUTHOR]
Database: Complementary Index