Author: |
Cordero JM; Universidad Politécnica de Madrid (UPM). ETSII-UPM, José Gutiérrez Abascal 2, 28006, Madrid, Spain. jm.cordero@upm.es., Rojo J; University of Castilla-La Mancha. Institute of Environmental Sciences (Botany), Avda. Carlos III s/n, E-45071, Toledo, Spain., Gutiérrez-Bustillo AM; Department of Pharmacology, Pharmacognosy and Botany, Complutense University of Madrid, Ciudad Universitaria, 28040, Madrid, Spain., Narros A; Universidad Politécnica de Madrid (UPM). ETSII-UPM, José Gutiérrez Abascal 2, 28006, Madrid, Spain., Borge R; Universidad Politécnica de Madrid (UPM). ETSII-UPM, José Gutiérrez Abascal 2, 28006, Madrid, Spain. |
Abstract: |
Air pollution in large cities causes numerous diseases and, according to the World Health Organization, millions of deaths annually. Pollen exposure is related to allergic diseases, which makes pollen forecasting a valuable tool for assessing the risk posed by aeroallergens. However, airborne pollen concentrations are difficult to predict because of the inherent complexity of the relationships among biotic and environmental variables. In this work, a stochastic approach based on supervised machine learning algorithms was used to forecast daily Olea pollen concentrations in the Community of Madrid, central Spain, from 1993 to 2018. First, individual Light Gradient Boosting Machine (LightGBM) and artificial neural network (ANN) models were applied to predict the day of the year (DOY) on which the peak of the pollen season occurs. The estimated average peak dates were 149.1 ± 9.3 DOY for LightGBM and 150.1 ± 10.8 DOY for ANN, close to the observed value (148.8 ± 9.8 DOY). Second, daily pollen concentrations over the entire pollen season were calculated using an ensemble of a two-step generalized additive model (GAM) followed by LightGBM and ANN. The predicted daily pollen concentrations showed a coefficient of determination (r²) above 0.75 (goodness of fit of the model following cross-validation). The predictors included in the ensemble models were meteorological variables, phenological metrics, site-specific characteristics, and preceding pollen concentrations. These state-of-the-art machine learning models show potential for deployment to understand and predict pollen risk levels during the main olive pollen season. |