Anomaly Detection Using a Sliding Window Technique and Data Imputation with Machine Learning for Hydrological Time Series
Author: | Kanoksri Sarinnapakorn, Montri Maleewong, Chantana Chantrapornchai, Papis Wongchaisuwat, Surajate Boonya-aroonnet, Supaluk Wimala, Lattawit Kulanuwat |
---|---|
Year of publication: | 2021 |
Subject: |
Computer science
data imputation; Geography, Planning and Development; 02 engineering and technology; sliding window; 010501 environmental sciences; Aquatic Science; 01 natural sciences; Biochemistry; water management; 020204 information systems; Sliding window protocol; 0202 electrical engineering, electronic engineering, information engineering; Imputation (statistics); TD201-500; 0105 earth and related environmental sciences; Water Science and Technology; Water supply for domestic and industrial purposes; business.industry; Anomaly (natural sciences); Pattern recognition; Hydraulic engineering; anomaly detection; Data point; Outlier; median absolute deviation; Anomaly detection; Artificial intelligence; time series; LSTM; TC1-978; business; Spline interpolation; Interpolation |
Source: | Water, Vol 13, Iss 13, p 1862 (2021) |
ISSN: | 2073-4441 |
Description: | Water level data obtained from telemetry stations typically contain a large number of outliers. Anomaly detection and data imputation are therefore necessary steps in a data monitoring system. A data point can be flagged as anomalous when its value lies outside the distribution of the normal pattern. We developed a median-based statistical outlier detection approach using a sliding window technique. To fill the detected anomalies, various interpolation techniques were considered. Our proposed framework showed promising results when evaluated with F1-score and root mean square error (RMSE) on artificially induced anomalous data points. The system can also be readily applied to various patterns of hydrological time series, with a choice of internal methods and fine-tuned parameters. Specifically, the spline interpolation method performed best on non-cyclical data, while the long short-term memory (LSTM) model outperformed the other interpolation methods on a distinct tidal data pattern. |
Database: | OpenAIRE |
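
The description above outlines a median-based outlier detector over a sliding window, followed by interpolation of the flagged points. A minimal sketch of that idea follows. It is not the authors' implementation: the centred window, the window size, the threshold of 3, the 1.4826 scaling of the median absolute deviation, and the function names `detect_anomalies` and `fill_linear` are all assumptions made for illustration, and plain linear interpolation stands in for the spline and LSTM methods the record mentions.

```python
from statistics import median

# Assumed scaling: makes the MAD comparable to a standard deviation
# for normally distributed data.
MAD_SCALE = 1.4826


def detect_anomalies(values, window=5, threshold=3.0):
    """Flag points that lie far from the median of a centred sliding window."""
    half = window // 2
    flags = []
    for i, v in enumerate(values):
        # The window is truncated at the start and end of the series.
        neighbourhood = values[max(0, i - half): i + half + 1]
        med = median(neighbourhood)
        mad = median([abs(x - med) for x in neighbourhood])
        if mad == 0:
            # Constant window: any deviation from the median is flagged.
            flags.append(v != med)
        else:
            flags.append(abs(v - med) / (MAD_SCALE * mad) > threshold)
    return flags


def fill_linear(values, flags):
    """Replace flagged points by linear interpolation between the nearest
    unflagged neighbours (a stand-in for spline or LSTM imputation)."""
    good = [i for i, flagged in enumerate(flags) if not flagged]
    filled = list(values)
    for i, flagged in enumerate(flags):
        if not flagged:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is not None and right is not None:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
        elif left is not None:
            filled[i] = values[left]   # no valid point to the right
        elif right is not None:
            filled[i] = values[right]  # no valid point to the left
    return filled


# Hypothetical water levels (metres) with one injected spike at index 4.
levels = [1.0, 1.1, 1.2, 1.1, 9.0, 1.3, 1.2, 1.1, 1.0]
flags = detect_anomalies(levels)
filled = fill_linear(levels, flags)
```

In this sketch each window includes the point being tested; the median and MAD are robust enough that a single spike does not mask itself, but runs of consecutive outliers longer than half the window would need a wider window or a different treatment.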