Artificial Neural Networks, Sequence-to-Sequence LSTMs, and Exogenous Variables as Analytical Tools for NO₂ (Air Pollution) Forecasting: A Case Study in the Bay of Algeciras (Spain).

Authors: González-Enrique J; Intelligent Modelling of Systems Research Group (MIS), Department of Computer Science Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain., Ruiz-Aguilar JJ; Intelligent Modelling of Systems Research Group (MIS), Department of Industrial and Civil Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain., Moscoso-López JA; Intelligent Modelling of Systems Research Group (MIS), Department of Industrial and Civil Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain., Urda D; Grupo de Inteligencia Computacional Aplicada (GICAP), Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad de Burgos, Av. Cantabria s/n, 09006 Burgos, Spain., Deka L; The De Montfort University Interdisciplinary Group in Intelligent Transport Systems (DIGITS), Department of Computer Science and Informatics, De Montfort University, Leicester LE1 9BH, UK., Turias IJ; Intelligent Modelling of Systems Research Group (MIS), Department of Computer Science Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain.
Language: English
Source: Sensors (Basel, Switzerland) [Sensors (Basel)] 2021 Mar 04; Vol. 21 (5). Date of Electronic Publication: 2021 Mar 04.
DOI: 10.3390/s21051770
Abstract: This study aims to produce accurate predictions of NO₂ concentrations at a specific station of a monitoring network located in the Bay of Algeciras (Spain). Artificial neural networks (ANNs) and sequence-to-sequence long short-term memory networks (LSTMs) were used to create the forecasting models. Additionally, a new prediction method (LSTM-CVT) was proposed, combining LSTMs that use a rolling window scheme with a cross-validation procedure for time series. Two different strategies were followed regarding the input variables: using NO₂ from the station, or using NO₂ and other pollutant data from any station of the network plus meteorological variables. The ANN and LSTM-CVT exogenous models used lagged datasets of different window sizes. Several feature ranking methods were used to select the top lagged variables and include them in the final exogenous datasets. Prediction horizons of t + 1, t + 4 and t + 8 were employed. Including the exogenous variables improved the models' performance, especially for t + 4 (ρ ≈ 0.68 to ρ ≈ 0.74) and t + 8 (ρ ≈ 0.59 to ρ ≈ 0.66). The proposed LSTM-CVT method delivered promising results, as the best-performing model for each prediction horizon employed this new methodology. Additionally, for each parameter combination, it obtained lower error values than the ANNs in 85% of the cases.
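Illustrative sketch (not part of the catalog record, and not the authors' code): the abstract mentions lagged datasets of a given window size, prediction horizons such as t + 4, and a rolling window scheme with cross-validation for time series. The Python below shows one common way such pieces can be built. The function names `make_lagged` and `rolling_window_splits`, the window and fold sizes, and the toy series are all assumptions chosen for illustration; the paper's actual LSTM-CVT procedure may differ.

```python
import numpy as np


def make_lagged(series, window, horizon):
    """Build a lagged dataset from a 1-D series (hypothetical helper).

    Each row of X holds `window` consecutive values ending at time t;
    the matching entry of y is the value at time t + horizon.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    first_target = window + horizon - 1
    y = series[first_target:first_target + n]
    return X, y


def rolling_window_splits(n_samples, train_size, test_size):
    """Yield (train_idx, test_idx) pairs for a rolling-window scheme.

    The training window has a fixed length and slides forward by
    `test_size` samples per fold; the test block always follows the
    training block, so no future sample is used for training.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size,
                             start + train_size + test_size)
        yield train_idx, test_idx
        start += test_size


# Toy series standing in for hourly NO2 readings (assumed data).
series = np.arange(20.0)
X, y = make_lagged(series, window=3, horizon=4)
splits = list(rolling_window_splits(len(X), train_size=8, test_size=2))
```

With a horizon greater than 1, the targets of the last training rows overlap in time with the inputs of the first test rows, so a gap between the two blocks may be worth adding in practice.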
Database: MEDLINE