Multivariate monthly water demand prediction using ensemble and gradient boosting machine learning techniques

Authors: Paul Banda, Muhammed Bhuiyan, Kevin Zhang, Andy Song
Year of publication: 2022
Source: Proceedings of the International Conference on Evolving Cities
ISSN: 2754-5768
DOI: 10.55066/proc-icec.2021.14
Description: Water management planning requires reliable and accurate water demand forecasting. Water demand is affected by climatic, socio-economic, and demographic variables. This paper investigates the prediction of urban monthly average water demand with classical, ensemble, and gradient boosting-based machine learning models, using the available monthly water demand, climatic, economic, and demographic data. Three train-test split schemes on the water demand time series were considered to determine the effect of data size on water demand prediction. Sensitivity analysis was employed to reduce input feature dimensionality while maintaining model accuracy. A univariate time series (water demand only) produced R² scores of up to 0.91, which increased to 0.94 with the addition of calendar and climatic features. Increasing the training data size from 70% to 90% improved the RMSE and MAE scores of the ensemble and gradient boosting methods, with the random forest and AdaBoost models showing improvements of up to 69%. The sensitivity analysis successfully reduced the inputs from a potential 17 attributes to seven. Gradient boosting models showed robust performance and faster execution times, especially as the training data increased, which makes them attractive for medium-term urban water demand forecasting.
Database: OpenAIRE