Description: |
Water management planning requires reliable and accurate water demand forecasting. Water demand prediction depends on climatic, socio-economic, and demographic variables. This paper investigates the prediction of urban monthly average water demand with classical, ensemble, and gradient boosting-based machine learning models, using the available monthly water demand, climatic, economic, and demographic data. Three train-test split schemes on the water demand time series were considered to determine the effect of training data size on prediction. Sensitivity analysis was employed to reduce input feature dimensionality while maintaining model accuracy. A univariate time series (water demand only) produced R² scores of up to 0.91, which increased to 0.94 with the addition of calendar and climatic features. Increasing the training data size from 70% to 90% improved the RMSE and MAE scores of the ensemble and gradient boosting methods, with the random forest and AdaBoost models showing improvements of up to 69%. The sensitivity analysis successfully reduced the inputs from a potential 17 attributes to seven. Gradient boosting models were robust and executed faster, especially as the training data increased, which makes them attractive for medium-term urban water demand forecasting. |