Time series anomaly detection based on shapelet learning
Author: | Martin Schiegg, Bernd Bischl, Laura Beggel, Bernhard X. Kausler, Michael Pfeiffer |
---|---|
Year of publication: | 2018 |
Subject: | Statistics and Probability; Series (mathematics); Computer science; business.industry; Feature vector; 05 social sciences; Pattern recognition; Hypersphere; 01 natural sciences; 010104 statistics & probability; Computational Mathematics; ComputingMethodologies_PATTERNRECOGNITION; Transformation (function); Feature (computer vision); 0502 economics and business; Decision boundary; Anomaly detection; Artificial intelligence; 0101 mathematics; Statistics, Probability and Uncertainty; business; 050205 econometrics; Test data |
Source: | Computational Statistics. 34:945-976 |
ISSN: | 1613-9658 0943-4062 |
DOI: | 10.1007/s00180-018-0824-9 |
Description: | We consider the problem of learning to detect anomalous time series from an unlabeled data set, possibly contaminated with anomalies in the training data. This scenario is important for applications in medicine, economics, or industrial quality control, in which labeling is difficult and requires expensive expert knowledge, and anomalous data is difficult to obtain. This article presents a novel method for unsupervised anomaly detection based on the shapelet transformation for time series. Our approach learns representative features that describe the shape of time series stemming from the normal class, and simultaneously learns to accurately detect anomalous time series. An objective function is proposed that encourages learning of a feature representation in which the normal time series lie within a compact hypersphere of the feature space, whereas anomalous observations will lie outside of a decision boundary. This objective is optimized by a block-coordinate descent procedure. Our method can efficiently detect anomalous time series in unseen test data without retraining the model by reusing the learned feature representation. We demonstrate on multiple benchmark data sets that our approach reliably detects anomalous time series, and is more robust than competing methods when the training instances contain anomalous time series. |
Database: | OpenAIRE |
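
The description above mentions two ingredients: a shapelet transformation that maps a time series to a feature vector, and a hypersphere in feature space that separates normal from anomalous series. The following is only a minimal sketch of how those two pieces could fit together at test time, not the authors' implementation. The function names, the length-normalised squared Euclidean distance, and the hand-picked shapelet, center, and radius are all assumptions made for illustration; in the described method the shapelets and the decision boundary are learned jointly by block-coordinate descent, which this sketch does not attempt.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest length-normalised squared Euclidean distance between the
    shapelet and any equally long contiguous subsequence of the series.
    (The exact distance used in the paper is not given in the record;
    this choice is an assumption.)"""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        d = sum((series[start + j] - shapelet[j]) ** 2 for j in range(m)) / m
        best = min(best, d)
    return best


def shapelet_transform(series, shapelets):
    """Map one time series to a feature vector with one distance per shapelet."""
    return [shapelet_distance(series, s) for s in shapelets]


def is_anomalous(features, center, radius):
    """Flag a feature vector that lies outside the hypersphere
    defined by center and radius (both hypothetical values here)."""
    dist_sq = sum((f - c) ** 2 for f, c in zip(features, center))
    return dist_sq > radius ** 2


# Hand-picked values for illustration only; nothing here is learned.
shapelets = [[0.0, 1.0, 0.0]]
center, radius = [0.0], 1.0

normal_series = [0.0, 0.0, 1.0, 0.0, 0.0]
odd_series = [0.0, 0.0, 5.0, 0.0, 0.0]

normal_features = shapelet_transform(normal_series, shapelets)
odd_features = shapelet_transform(odd_series, shapelets)

print(is_anomalous(normal_features, center, radius))
print(is_anomalous(odd_features, center, radius))
```

Because the shapelets, center, and radius are fixed once training is done, scoring an unseen series only requires the transform and one distance comparison, which is consistent with the record's statement that test data can be handled without retraining.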