Interval forecasts based on regression trees for streaming data
Authors: | Stuart Barber, Charles C. Taylor, Zoka Milan, Xin Zhao |
---|---|
Year of publication: | 2019 |
Subject: |
Statistics and Probability;
Computer science; Test data generation; Applied Mathematics; Autoregressive conditional heteroskedasticity; CPU time; Inference; 02 engineering and technology; Interval (mathematics); 01 natural sciences; Regression; Computer Science Applications; 010104 statistics & probability; Tree (data structure); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Autoregressive integrated moving average; 0101 mathematics; Algorithm |
Source: | Advances in Data Analysis and Classification. 15:5-36 |
ISSN: | 1862-5355 1862-5347 |
DOI: | 10.1007/s11634-019-00382-7 |
Description: | In forecasting, we often require interval forecasts instead of just a specific point forecast. To track streaming data effectively, this interval forecast should reliably cover the observed data and yet be as narrow as possible. To achieve this, we propose two methods based on regression trees: one ensemble method and one method based on a single tree. For the ensemble method, we use weighted results from the most recent models, and for the single-tree method, we retain one model until it becomes necessary to train a new model. We propose a novel method to update the interval forecast adaptively using root mean square prediction errors calculated from the latest data batch. We use wavelet-transformed data to capture long time variable information and conditional inference trees for the underlying regression tree model. Results show that both methods perform well, having good coverage without the intervals being excessively wide. When the underlying data generation mechanism changes, their performance is initially affected but can recover relatively quickly as time proceeds. The method based on a single tree performs the best in computational (CPU) time compared to the ensemble method. When compared to ARIMA and GARCH modelling, our methods achieve better or similar coverage and width but require considerably less CPU time. |
Database: | OpenAIRE |
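
The description mentions updating the interval forecast from root mean square prediction errors computed on the latest data batch, and judging the result by coverage and width. A minimal Python sketch of that idea follows. It is not the authors' algorithm: the symmetric form `forecast ± z * RMSPE`, the multiplier `z`, and the function names `rmspe`, `interval_forecast`, and `coverage` are all assumptions made for illustration, since the record does not state the exact update rule.

```python
import math


def rmspe(actual, predicted):
    """Root mean square prediction error over one batch of observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("batches must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def interval_forecast(point_forecast, batch_actual, batch_predicted, z=1.96):
    """Build an interval around a point forecast.

    Assumption: the half-width is z times the RMSPE of the latest batch.
    The multiplier z is a hypothetical tuning parameter, not taken from
    the source.
    """
    half_width = z * rmspe(batch_actual, batch_predicted)
    return point_forecast - half_width, point_forecast + half_width


def coverage(actual, intervals):
    """Fraction of observations that fall inside their forecast interval."""
    hits = sum(lo <= a <= hi for a, (lo, hi) in zip(actual, intervals))
    return hits / len(actual)
```

In this sketch a batch with larger recent errors widens the next interval and a batch with smaller errors narrows it, which is one simple way to trade coverage against width as the description requires.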