Spatial and spatiotemporal modelling of intra-urban ultrafine particles: A comparison of linear, nonlinear, regularized, and machine learning methods.

Authors: Vachon J; Department of Environmental and Occupational Health, School of Public Health, University of Montreal, Montreal, Canada; Center for Public Health Research (CReSP), University of Montreal and CIUSSS du Centre-Sud-de-l'Île-de-Montréal, Montreal, Canada., Buteau S; Department of Environmental and Occupational Health, School of Public Health, University of Montreal, Montreal, Canada; Center for Public Health Research (CReSP), University of Montreal and CIUSSS du Centre-Sud-de-l'Île-de-Montréal, Montreal, Canada., Liu Y; Department of Environmental and Occupational Health, School of Public Health, University of Montreal, Montreal, Canada., Van Ryswyk K; Air Pollution Exposure Science Section, Water and Air Quality Bureau, Health Canada, Ottawa, Canada., Hatzopoulou M; Department of Civil Engineering, University of Toronto, Toronto, Canada., Smargiassi A; Department of Environmental and Occupational Health, School of Public Health, University of Montreal, Montreal, Canada; Center for Public Health Research (CReSP), University of Montreal and CIUSSS du Centre-Sud-de-l'Île-de-Montréal, Montreal, Canada. Electronic address: audrey.smargiassi@umontreal.ca.
Language: English
Source: The Science of the total environment [Sci Total Environ] 2024 Dec 01; Vol. 954, pp. 176523. Date of Electronic Publication: 2024 Sep 24.
DOI: 10.1016/j.scitotenv.2024.176523
Abstract: Background: Machine learning methods have been proposed to improve predictions of ambient air pollution, yet few studies have compared ultrafine particle (UFP) models across a broad range of statistical and machine learning approaches, and only one compared spatiotemporal models. Most reported marginal differences between methods. This limits our ability to draw conclusions about the best methods to model ambient UFPs.
Objective: To compare the performance and predictions of statistical and machine learning methods used to model spatial and spatiotemporal ambient UFPs.
Methods: Daily and annual models were developed from UFP measurements from a year-long mobile monitoring campaign in Quebec City, Canada, combined with 262 geospatial and six meteorological predictors. Various road segment lengths were considered (100/300/500 m) for UFP data aggregation. The four statistical methods comprised linear, non-linear, and regularized regressions, whereas the eight machine learning regressions used tree-based, neural network, support vector, and kernel ridge algorithms. Nested cross-validation was used for model training, hyperparameter tuning and performance evaluation.
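As an illustration of the nested cross-validation mentioned in the Methods, the following is a minimal sketch and not the authors' pipeline. It assumes a single synthetic predictor and a one-dimensional ridge regression standing in for the study's models; the function names (`fit_ridge_1d`, `nested_cv`), the fold counts, and the penalty grid are all hypothetical. The inner loop selects the hyperparameter using only the outer training data, and the outer loop reports the RMSE on data never seen during tuning.

```python
import random
from statistics import mean


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for one predictor; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


def rmse(model, xs, ys):
    """Root mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    sq = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
    return (sq / len(xs)) ** 0.5


def k_folds(indices, k):
    """Split a list of indices into k disjoint folds."""
    return [indices[i::k] for i in range(k)]


def subset(values, indices):
    return [values[i] for i in indices]


def nested_cv(xs, ys, lambdas, outer_k=5, inner_k=3, seed=0):
    """Return one held-out RMSE per outer fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for test_idx in k_folds(idx, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]

        def inner_score(lam):
            # Inner loop: tune on the outer training data only.
            scores = []
            for val_idx in k_folds(train_idx, inner_k):
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                model = fit_ridge_1d(subset(xs, fit_idx), subset(ys, fit_idx), lam)
                scores.append(rmse(model, subset(xs, val_idx), subset(ys, val_idx)))
            return mean(scores)

        best_lam = min(lambdas, key=inner_score)
        # Refit on the full outer training set, then score on the held-out fold.
        model = fit_ridge_1d(subset(xs, train_idx), subset(ys, train_idx), best_lam)
        outer_scores.append(rmse(model, subset(xs, test_idx), subset(ys, test_idx)))
    return outer_scores


# Synthetic data (an assumption for illustration): y = 2x + unit Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 * x + rng.gauss(0, 1) for x in xs]
scores = nested_cv(xs, ys, lambdas=[0.0, 1.0, 10.0, 100.0])
print("outer-fold RMSE:", [round(s, 2) for s in scores])
```

Keeping hyperparameter selection inside the inner loop is what prevents the outer-fold scores from being optimistically biased by tuning.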
Results: The mean annual UFP concentration was 13,335 particles/cm³. Machine learning outperformed statistical methods in predicting UFPs. Tree-based methods performed best across temporal scales and segment lengths, with XGBoost producing the best-performing models overall (annual R² = 0.78-0.86, RMSE = 2163-2169 particles/cm³; daily R² = 0.47-0.48, RMSE = 8651-11,422 particles/cm³). With 100 m segments, other annual models performed similarly well, but their prediction surfaces of annual mean UFP concentrations showed signs of overfitting. Spatial aggregation of monitoring data significantly impacted model performance. Longer segments yielded lower RMSE in all daily models and for annual statistical models, but not for annual machine learning models.
Conclusions: The use of tree-based methods significantly improved spatiotemporal predictions of UFP concentrations, and to a lesser extent annual concentrations. Segment length and hyperparameter tuning had notable impacts on model performance and should be considered in future studies.
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2024 The Authors. Published by Elsevier B.V. All rights reserved.)
Database: MEDLINE