Time series motifs discovery under DTW allows more robust discovery of conserved structure

Autor:	Sara Alaee, Ryan Mercer, Kaveh Kamgar, Eamonn Keogh
Rok vydání:	2021
Předmět:	Dynamic time warping Theoretical computer science Similarity (geometry) Computational complexity theory Computer Networks and Communications Computer science Brute-force search 02 engineering and technology Automatic summarization Computer Science Applications Euclidean distance ComputingMethodologies_PATTERNRECOGNITION 020204 information systems 0202 electrical engineering electronic engineering information engineering 020201 artificial intelligence & image processing Segmentation Cluster analysis Information Systems
Zdroj:	Data Mining and Knowledge Discovery. 35:863-910
ISSN:	1573-756X 1384-5810
DOI:	10.1007/s10618-021-00740-0
Popis:	In recent years, time series motif discovery has emerged as perhaps the most important primitive for many analytical tasks, including clustering, classification, rule discovery, segmentation, and summarization. In parallel, it has long been known that Dynamic Time Warping (DTW) is superior to other similarity measures such as Euclidean Distance under most settings. However, due to the computational complexity of both DTW and motif discovery, virtually no research efforts have been directed at combining these two ideas. The current best mechanisms to address their lethargy appear to be mutually incompatible. In this work, we present the first efficient, scalable and exact method to find time series motifs under DTW. Our method automatically performs the best trade-off of time-to-compute versus tightness-of-lower-bounds for a novel hierarchy of lower bounds that we introduce. As we shall show through extensive experiments, our algorithm prunes up to 99.99% of the DTW computations under realistic settings and is up to three to four orders of magnitude faster than the brute force search, and two orders of magnitude faster than the only other competitor algorithm. This allows us to discover DTW motifs in massive datasets for the first time. As we will show, in many domains, DTW-based motifs represent semantically meaningful conserved behavior that would escape our attention using all existing Euclidean distance-based methods.
Databáze:	OpenAIRE
Externí odkaz:	https://explore.openaire.eu/search/publication?articleId=doi_________::9fccd661d235f02fe0d2c92072f63574 https://doi.org/10.1007/s10618-021-00740-0 Zobrazit plný text záznamu Full text from SpringerLink