Gradient boosting machine learning to improve satellite-derived column water vapor measurement error
Author: | Yujie Wang, Robert B. Chatfield, Meytar Sorek-Hamer, Michael Dorman, Johnathan Rush, Alexei Lyapustin, Yang Liu, Itai Kloog, Allan C. Just |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: |
Atmospheric Science; Observational error; 010504 meteorology & atmospheric sciences; Mean squared error; lcsh:TA715-787; lcsh:Earthwork. Foundations; MAIAC; Atmospheric correction; 010501 environmental sciences; Overfitting; 01 natural sciences; Article; AERONET; lcsh:Environmental engineering; Machine Learning; Environmental science; Satellite; Gradient boosting; Moderate-resolution imaging spectroradiometer; lcsh:TA170-171; CWV; XGBoost; 0105 earth and related environmental sciences; Remote sensing |
Source: | Atmospheric Measurement Techniques, Vol 13, Pp 4669-4681 (2020) |
ISSN: | 1867-8548 1867-1381 |
Description: | The atmospheric products of the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm include column water vapor (CWV) at 1 km resolution, derived from daily overpasses of NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) instruments aboard the Aqua and Terra satellites. We have recently shown that machine learning using extreme gradient boosting (XGBoost) can improve the estimation of MAIAC aerosol optical depth (AOD). Although MAIAC CWV is generally well validated (Pearson's R > 0.97 versus CWV from AERONET sun photometers), it has not yet been assessed whether machine-learning approaches can further improve CWV. Using a novel spatiotemporal cross-validation approach to avoid overfitting, our XGBoost model with nine features derived from land use terms, date, and ancillary variables from the MAIAC retrieval quantifies and can correct a substantial portion of measurement error relative to collocated measures at AERONET sites (26.9% and 16.5% decrease in root mean square error (RMSE) for the Terra and Aqua datasets, respectively) in the Northeastern USA, 2000-2015. We use machine-learning interpretation tools to illustrate complex patterns of measurement error and describe a positive bias in MAIAC Terra CWV that worsens in recent summertime conditions. We validate our predictive model on MAIAC CWV estimates at independent stations from the SuomiNet GPS network, where our corrections decrease the RMSE by 19.7% and 9.5% for Terra and Aqua MAIAC CWV, respectively. Empirically correcting for measurement error with machine-learning algorithms is a post-processing opportunity to improve satellite-derived CWV data for Earth science and remote sensing applications. # About the attachment # The zip file (CWV-project-repository.zip) is an R project that contains all the code (/Code) and data (/Data) needed to reproduce the results. 
The folder (/Data) contains JSON files that can be opened directly in a browser, in a text editor, or in R using functions like `jsonlite::fromJSON`. The folder (/Intermediate) contains the intermediate cross-validation modeling results. When the R project is opened via _cwv_paper.Rproj_, the R Markdown files (mainly _03_cwv_10by10cv_resultsmd.Rmd_) reproduce all the results (figures and tables) used in the paper. The HTML files are the rendered output of the R Markdown files. |
Database: | OpenAIRE |
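
The description above reports percentage decreases in RMSE under a cross-validation scheme that guards against overfitting. The following is a minimal Python sketch of that general idea, not the authors' method (their repository is an R project using XGBoost, and the details of their spatiotemporal scheme are not given here). It assumes a simple leave-site-out split, and it substitutes a mean-bias correction for the gradient-boosting model. All function names and the toy records are hypothetical.

```python
import math


def rmse(errors):
    """Root mean square of a sequence of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def percent_rmse_decrease(rmse_before, rmse_after):
    """Relative RMSE reduction in percent (positive means improvement)."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


def site_folds(sites, k):
    """Assign each distinct site to one of k folds, so that all records
    from a given site are held out together."""
    unique = sorted(set(sites))
    return {site: i % k for i, site in enumerate(unique)}


def cross_validated_correction(records, k=2):
    """records: list of (site, satellite_cwv, reference_cwv) tuples.

    Stand-in model (an assumption for illustration): subtract the mean
    error learned from the training sites. A real application would fit
    a richer model, such as gradient boosting, on several features.
    Returns (RMSE of raw errors, RMSE of corrected errors), both computed
    only on held-out sites.
    """
    folds = site_folds([r[0] for r in records], k)
    raw_errors, corrected_errors = [], []
    for fold in range(k):
        train = [r for r in records if folds[r[0]] != fold]
        test = [r for r in records if folds[r[0]] == fold]
        if not train or not test:
            continue
        bias = sum(sat - ref for _, sat, ref in train) / len(train)
        for _, sat, ref in test:
            raw_errors.append(sat - ref)
            corrected_errors.append((sat - bias) - ref)
    return rmse(raw_errors), rmse(corrected_errors)


# Hypothetical toy data (cm of CWV), not values from the paper.
records = [
    ("A", 1.30, 1.10), ("A", 2.40, 2.20),
    ("B", 0.95, 0.80), ("B", 3.10, 2.85),
    ("C", 1.75, 1.60), ("C", 2.05, 1.80),
    ("D", 1.20, 1.00), ("D", 2.90, 2.70),
]
before, after = cross_validated_correction(records, k=2)
print(f"RMSE before: {before:.3f}, after: {after:.3f}, "
      f"decrease: {percent_rmse_decrease(before, after):.1f}%")
```

Holding out whole sites means the correction is always scored at locations it never saw during fitting, which is the property that makes the reported RMSE decrease a fair estimate of performance at new stations.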