Popis: |
Air pollution contributes to millions of deaths worldwide every year. The concentration of a particular air pollutant, such as ozone, is controlled by physical and chemical processes which act on varying temporal and spatial scales. Quantifying the strength of causal drivers (e.g. temperature) on air pollution from observational data, particularly at extrema, is challenging due to the difficulty of disentangling correlation and causation, as many drivers are correlated. Furthermore, because air pollution is controlled in part by large scale atmospheric phenomena, using local (e.g. individual grid cell level) covariates for analysis is insufficient to fully capture the effect of these phenomena on air pollution. Access to large spatiotemporal datasets of air pollutant concentrations and atmospheric variables, coupled with recent advances in self-supervised learning, allow us to learn reduced representations of spatiotemporal atmospheric fields, and therefore account for non-local and non-instantaneous processes in downstream tasks. We show that these learned reduced representations can be useful for tasks such as air pollution forecasting, and crucially to quantify the causal effect of varying atmospheric fields on air pollution. We make use of recent advances in bounding causal effects in the presence of unobserved confounding to estimate, with uncertainty, the causal effect of changing atmospheric fields on air pollution. Finally, we compare our quantification of the causal drivers of air pollution to results from other approaches, and explore implications for our methods and for the wider goal of improving the process-level treatment of air pollutants in chemistry-climate models. |