Description: |
Monitoring streamed data to detect abnormal behaviour (variously known as event detection, anomaly detection, change detection, or outlier detection) underlies many applications of the Internet of Things. There, one often collects data from a variety of sources, with asynchronous sampling and missing data. In this setting, one can detect abnormal behaviour using low-rank techniques. Specifically, we assume that normal observations come from a low-rank subspace, prior to being corrupted by uniformly distributed noise. Correspondingly, we aim to recover a representation of the subspace and perform event detection by running a point-to-subspace distance query for each incoming observation. To obtain the low-rank model, we apply a variant of low-rank factorisation, which considers interval uncertainty sets around "known entries", to a suitable flattening of the input data. On-line, we compute the distance of incoming data to the low-rank normal subspace and update the subspace to keep it consistent with the seasonal changes present in the data. For the distance computation, we suggest subsampling the coordinates. Using techniques from learning theory and computational geometry, we bound the one-sided error as a function of the number of coordinates employed. In our experimental evaluation, we test the ability of the proposed algorithm to identify samples of abnormal behaviour in induction-loop data from Dublin, Ireland. |
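
The on-line distance query with coordinate subsampling can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' algorithm: it takes a basis of the normal subspace as given, swaps in a Euclidean least-squares residual for the distance, and uses synthetic data. The names `subsampled_distance`, `basis`, `num_coords`, and `observed` are hypothetical.

```python
import numpy as np


def subsampled_distance(basis, x, num_coords, rng, observed=None):
    """Least-squares residual of x against span(basis) on a random subset of coordinates.

    Assumption: Euclidean residual, used here purely for illustration.
    `observed` is an optional boolean mask marking the entries of x that are not missing.
    """
    n = basis.shape[0]
    candidates = np.arange(n) if observed is None else np.flatnonzero(observed)
    size = min(num_coords, candidates.size)
    idx = rng.choice(candidates, size=size, replace=False)
    # Fit the subspace coefficients using only the sampled coordinates.
    coeffs, *_ = np.linalg.lstsq(basis[idx], x[idx], rcond=None)
    return float(np.linalg.norm(x[idx] - basis[idx] @ coeffs))


rng = np.random.default_rng(0)
n, r = 200, 3
basis = rng.standard_normal((n, r))  # hypothetical rank-3 model of normal behaviour

# Synthetic "normal" sample: a point in the subspace plus uniformly distributed noise.
normal = basis @ rng.standard_normal(r) + rng.uniform(-0.05, 0.05, n)
# Synthetic "abnormal" sample: the same point with a large perturbation added.
abnormal = normal + 2.0 * rng.standard_normal(n)

d_normal = subsampled_distance(basis, normal, 40, rng)
d_abnormal = subsampled_distance(basis, abnormal, 40, rng)
d_full = subsampled_distance(basis, abnormal, n, rng)  # all coordinates
```

In this sketch the residual on a subset of coordinates never exceeds the residual on all coordinates, so a sample flagged by the subsampled query would also be flagged by the full one; errors can only be missed events, which is the sense in which the error is one-sided here.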