Reconstruction of GRACE Total Water Storage Through Automated Machine Learning

Authors: Alex Sun, Bridget Scanlon, Himanshu Save, Ashraf Rateb
Year of publication: 2020
Description: The GRACE satellite mission and its follow-on, GRACE-FO, have provided unprecedented opportunities to quantify the impact of climate extremes and human activities on total water storage at large scales. The approximately one-year data gap between the two GRACE missions needs to be filled to maintain data continuity and maximize mission benefits. There is strong interest in using machine learning (ML) algorithms to reconstruct GRACE-like data to fill this gap. So far, most studies have attempted to train and select a single ML algorithm to work for global basins. However, hydrometeorological predictors may exhibit strong spatial variability which, in turn, may affect the performance of ML models. Existing studies have already shown that no single algorithm consistently outperforms others over all global basins. In this study, we applied an automated machine learning (AutoML) workflow to perform GRACE data reconstruction. AutoML represents a new paradigm for optimal model structure selection, hyperparameter tuning, and model ensemble stacking, addressing some of the most challenging issues related to ML applications. We demonstrated the AutoML workflow over the conterminous U.S. (CONUS) using six types of ML algorithms and multiple groups of meteorological and climatic variables as predictors. Results indicate that the AutoML-assisted gap filling achieved satisfactory performance over the CONUS. For the testing period (2014/06–2017/06), the mean gridwise Nash-Sutcliffe efficiency is around 0.85, the mean correlation coefficient is around 0.95, and the mean normalized root-mean-square error is about 0.09. Trained models maintain good performance when extrapolating to the mission gap and to GRACE-FO periods (after 2017/06). Results further suggest that no single algorithm provides the best predictive performance over the entire CONUS, stressing the importance of using an end-to-end workflow to train, optimize, and combine multiple machine learning models to deliver robust performance, especially when building large-scale hydrological prediction systems and when predictor importance exhibits strong spatial variability.
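The three skill scores quoted in the abstract (Nash-Sutcliffe efficiency, correlation coefficient, normalized root-mean-square error) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, and the normalization of the RMSE by the range of the observations is an assumption, since the abstract does not say which normalization was used. "Gridwise" is read here as computing each score per grid cell over time and then averaging across cells.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the squared error of the
    simulation relative to the squared deviation of obs from its mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def pearson_r(obs, sim):
    """Pearson correlation coefficient between obs and sim."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


def nrmse(obs, sim):
    """RMSE divided by the range of obs (ASSUMED normalization;
    the abstract does not specify one)."""
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))
    return rmse / (max(obs) - min(obs))


def gridwise_mean(metric, obs_grid, sim_grid):
    """Apply a metric to each grid cell's time series, then average.

    obs_grid and sim_grid are lists of per-cell time series."""
    scores = [metric(o, s) for o, s in zip(obs_grid, sim_grid)]
    return sum(scores) / len(scores)
```

For example, `gridwise_mean(nse, obs_grid, sim_grid)` would return the mean per-cell efficiency for a hypothetical set of observed and reconstructed time series. A perfect reconstruction gives an efficiency of 1, while predicting the observed mean everywhere gives 0.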
Database: OpenAIRE