Popis: |
Accurate and reliable discharge estimation plays an important role in water resource management as well as downstream applications such as ecosystem conservation and flood control. Recently, data-driven machine learning (ML) techniques showed seemingly insurmountable performance in runoff forecasting and other geophysical domains, but they still need to be improved in terms of reliability and interpretability. In this study, focusing on discharge estimation and management, we developed an ML-based framework and applied it to the Huitanggou sluice hydrological station in Anhui Province, China. The framework contains two ML algorithms, the ensemble learning random forest (ELRF) and the ensemble learning gradient boosting decision tree (ELGBDT). The SHapley Additive exPlanation (SHAP) was introduced into our framework to interpret the impact of the model features. In our framework, the correlation analysis of the dataset can provide feature information for modeling, and the quartile method was utilized to solve the outlier problem of the dataset. The Bayesian optimization algorithm was adopted to optimize the hyperparameters of the ensemble ML models. The ensemble ML models are further compared with the traditional stage–discharge rating curve (SDRC) method and the single ML model. The results show that the estimation performance of the ensemble ML models is superior to that of the SDRC and the single ML model. In addition, an analysis of the discharge estimation without considering the flow state was performed. This analysis reveals that the ensemble ML models have strong adaptability. The ensemble ML models accurately estimate the discharge, with a coefficient of determination of 0.963, a root mean squared error of 31.268, and a coefficient of correlation of 0.984. Our framework can prove helpful to improve the efficiency of short-term hydrological estimation and simultaneously provide the interpretation of the impact of the hydrological features on estimation results. |