An efficient computational investigation on accurate daily soil temperature prediction using boosting ensemble methods explanation based on SHAP importance analysis

Authors: Meysam Alizamir, Mo Wang, Rana Muhammad Adnan Ikram, Kaywan Othman Ahmed, Salim Heddam, Sungwon Kim
Language: English
Publication year: 2024
Subject:
Source: Results in Engineering, Vol 24, Iss , Pp 103220- (2024)
Document type: article
ISSN: 2590-1230
DOI: 10.1016/j.rineng.2024.103220
Description: Accurate prediction of soil temperature (Ts) underpins geothermal applications, modern irrigation strategies in arid agricultural landscapes, and the understanding of ecosystem change. Ts is also crucial for estimating crop water requirements, and thus for the efficient management of scarce water resources in moisture-limited environments. The development of intelligent algorithms based on boosting ensembles, here XGBoost, CatBoost, LightGBM, and AdaBoost models driven by nine different input scenarios of meteorological variables, is therefore important for soil science investigations: such models offer insight into subsurface thermal dynamics at various depths and allow thermal gradients within the soil profile to be predicted accurately. To explain the mechanisms behind the models' predictions of Ts, the SHapley Additive exPlanations (SHAP) methodology was employed; it quantifies the contribution of each input variable and clarifies input-output dependencies. Model performance was evaluated at two meteorological stations, Penjwen and Bazian, in the Kurdistan region of Iraq, using the correlation coefficient (R), the root mean square error (RMSE), the Nash-Sutcliffe efficiency (NSE), and the mean absolute error (MAE). These metrics provided a comprehensive assessment of predictive accuracy and reliability, and the station-level analysis gave insight into the models' efficacy under specific regional climatic conditions. At Penjwen, the RMSE values indicated that LightGBM and CatBoost outperformed the other models in Ts estimation at depths of 5 cm and 10 cm, with RMSE values of 2.502 °C and 2.164 °C, respectively. At Bazian, CatBoost and LightGBM gave the best results at depths of 5 cm and 10 cm, with RMSE values of 2.069 °C and 1.786 °C, respectively. The findings suggest that meteorological variables can serve as effective inputs for predicting Ts with the proposed algorithms.
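
The four statistical indicators named in the description (R, RMSE, NSE, MAE) can be illustrated with a minimal Python sketch using their standard textbook definitions. The function name `evaluate` and the sample temperatures are hypothetical and are not taken from the article; the authors' own implementation is not given in this record.

```python
import math


def evaluate(observed, predicted):
    """Return R, RMSE, NSE and MAE for paired observed/predicted values.

    Hypothetical helper for illustration; assumes both series vary
    (non-zero variance), otherwise R and NSE are undefined.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)                      # sum of squared errors
    var_o = sum((o - mean_o) ** 2 for o in observed)      # spread of observations
    var_p = sum((p - mean_p) ** 2 for p in predicted)     # spread of predictions
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    return {
        "R": cov / math.sqrt(var_o * var_p),   # Pearson correlation coefficient
        "RMSE": math.sqrt(sse / n),            # root mean square error
        "NSE": 1.0 - sse / var_o,              # Nash-Sutcliffe efficiency
        "MAE": sum(abs(e) for e in errors) / n,  # mean absolute error
    }


# Made-up soil temperatures in °C, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
scores = evaluate(observed, predicted)
print(scores)
```

RMSE and MAE carry the unit of Ts (°C), with lower values being better, while R and NSE are dimensionless and approach 1 for a good fit.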
Database: Directory of Open Access Journals