Estimation of dissolved organic carbon from inland waters at a large scale using satellite data and machine learning methods

Authors: Lasse Harkort, Zheng Duan
Publication year: 2022
Subject:
Source: Water Research, 229
ISSN: 1879-2448
Description: Dissolved Organic Carbon (DOC) in inland waters plays an essential role in the global carbon cycle and has significant public health effects. Machine learning (ML) together with remote sensing has emerged as a powerful and promising combination to quantify water quality parameters from space. However, inland water sample data for DOC are limited. Hence, little is known about the potential to quantify DOC content in inland waters, especially over large-scale areas. This study presents the first attempt to estimate DOC in inland waters over a large-scale area using satellite data and ML methods with the newly published open-source dataset AquaSat. Four ML approaches, namely Random Forest Regression (RFR), Support Vector Regression (SVR), Gaussian Process Regression (GPR), and a Multilayer Backpropagation Neural Network (MBPNN), were trained using more than 16 thousand samples across the continental United States matched with satellite data from the Landsat 5, 7, and 8 missions. The Landsat data were further extended with environmental data from the ERA5-Land product and used as input to train the ML algorithms. Our results show that including environmental data as inputs considerably improved the prediction of DOC for all ML algorithms, with GPR showing the most promising performance with moderate estimation errors (RMSE: 4.08 mg/L). Permutation feature importance analysis showed that the visible Green band (from Landsat) and the monthly average air temperature (from ERA5-Land) were the most important variables for the ML approaches. The results demonstrate the predictive strength of GPR and its useful ability to derive per-pixel standard deviations for detailed analysis. Our results further highlight the important role of considering environmental processes to explain DOC variations over large scales.
The application and performance of the GPR in mapping spatiotemporal variations of DOC in an entire water body were discussed by taking Lake Okeechobee (the 8
Database: OpenAIRE
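
Note: the description mentions permutation feature importance, which ranks inputs by how much a model's error grows when one input column is shuffled. The following is a minimal sketch of that idea only, not the authors' pipeline. It assumes synthetic data, a small k-nearest-neighbour regressor as a stand-in for the GPR used in the study, and hypothetical feature names (`green_band`, `air_temperature`, `noise`) chosen purely for illustration.

```python
import math
import random


def knn_predict(train_X, train_y, row, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    nearest = sorted(
        (math.dist(row, x), y) for x, y in zip(train_X, train_y)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return baseline RMSE and the mean RMSE increase per shuffled feature."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; all other columns stay in place.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = rmse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return baseline, importances


# Synthetic data (assumption): the target depends strongly on the first
# feature, weakly on the second, and not at all on the third.
data_rng = random.Random(42)


def make_rows(n):
    X = [[data_rng.random() for _ in range(3)] for _ in range(n)]
    y = [3.0 * a + 0.5 * b + data_rng.gauss(0, 0.05) for a, b, _ in X]
    return X, y


train_X, train_y = make_rows(300)
test_X, test_y = make_rows(100)

# Hypothetical labels, not the variable names used in the study.
feature_names = ["green_band", "air_temperature", "noise"]

baseline, importances = permutation_importance(
    lambda row: knn_predict(train_X, train_y, row), test_X, test_y
)

print(f"baseline RMSE: {baseline:.3f}")
for name, value in zip(feature_names, importances):
    print(f"{name}: mean RMSE increase {value:.3f}")
```

Importance is measured on held-out rows so that it reflects what the model relies on for unseen data; a feature whose shuffling barely changes the error contributes little to the predictions.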