Popis: |
In this research, a novel explainable multi-level ensemble learning framework was developed to accurately monitor the greenhouse gas (GHG) emission drivers, i.e., carbon dioxide (CO2), nitrous oxide (N2O), and water vapour (H2O), of Atlantic Canada's potato crop system. For this purpose, hydro-meteorological and soil-property data were collected alongside the GHG emission drivers at three Canadian sites: two in Prince Edward Island (PEI) and one in New Brunswick. The framework combines a transparent multi-level preprocessing module and a Runge-Kutta optimizer (RUN) with an explainable gradient-boosted decision tree (GBDT) machine learning (ML) technique. The preprocessing scheme selects the most effective input combinations from the hydro-meteorological and soil-property datasets through a hybrid of Boruta-GBDT feature selection, Best Subset Lasso Regression (BSLR), and the Weighted Aggregated Sum Product Assessment (WASPAS). The optimal combinations were then modelled with GBDT-RUN and compared against two algorithms: LightGBM coupled with the RUN optimizer (LightGBM-RUN) and classical GBDT. The explainability of the primary model was enhanced using SHapley Additive exPlanations (SHAP). Model validation employed several metrics, including the correlation coefficient (R) and squared deviation (SquD), together with a range of statistical graphics. The results showed that the GBDT-RUN model outperformed both LightGBM-RUN and classical GBDT in monitoring GHG emissions (CO2: R = 0.8431, SquD = 17.1759, WASPAS = 1.88E-07; N2O: R = 0.8431, SquD = 17.1759, WASPAS = 1.88E-07; H2O: R = 0.8431, SquD = 17.1759, WASPAS = 1.88E-07). Furthermore, the explainability analysis identified dew point and soil temperature as the most influential factors in the CO2, N2O, and H2O emission scenarios. |