A new method for prediction of air pollution based on intelligent computation.

Author: Al-Janabi, Samaher; Mohammad, Mustafa; Al-Sultan, Ali
Subject:
Source: Soft Computing - A Fusion of Foundations, Methodologies & Applications; Jan 2020, Vol. 24, Issue 1, pp. 661-680 (20 pp.)
Abstract: The detection and treatment of increasing air pollution due to technological developments represent some of the most important challenges facing the world today. Indeed, there has been a significant increase in levels of environmental pollution in recent years. The aim of the work presented herein is to design an intelligent predictor for the concentrations of air pollutants over the next 2 days based on deep learning techniques using a recurrent neural network (RNN). The best structure for its operation is then determined using a particle swarm optimization (PSO) algorithm. The new predictor, based on intelligent computation relying on unsupervised learning, i.e., long short-term memory (LSTM), and on optimization, i.e., PSO, is called the smart air quality prediction model (SAQPM). The main goal is to predict the concentrations of six types of air pollutant, viz. PM2.5 particulate matter, PM10 particulate matter, nitrogen dioxide (NO2), carbon monoxide (CO), ozone (O3), and sulfur dioxide (SO2). SAQPM consists of four stages. The first stage involves data collection from multiple stations (35 in this case). The second stage involves preprocessing of the data, including (a) separation of each station with an independent focus, (b) handling of missing values, and (c) normalization of the dataset to the range of (0, 1) using the MinMaxScaler method. The third stage relates to building the predictor based on the LSTM method by identifying the best structure and parameter values (weight, bias, number of hidden layers, number of nodes in each hidden layer, and activation function) for the network using the functional PSO algorithm to achieve a goal. Thereafter, the dataset is split into training and testing parts based on the tenfold cross-validation principle. The training dataset is then used to build the predictor. In the fourth stage, evaluation results for each station are obtained by reading the concentration of each pollutant each hour for at most 30 days, then taking the average of the symmetric mean absolute percentage error (SMAPE) over 25 days only. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
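
Note: The abstract names two computations without defining them: min-max normalization to the range (0, 1) and the symmetric mean absolute percentage error (SMAPE). The following is a minimal sketch of both in plain Python, not the authors' implementation. The function names `min_max_scale` and `smape` are hypothetical, and the SMAPE variant used here (absolute error divided by the mean of the absolute actual and forecast values, averaged over all points) is an assumption, since the abstract does not say which of the common variants the paper uses.

```python
def min_max_scale(values):
    """Rescale a sequence linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant series: there is no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, as a fraction between 0 and 2.

    Assumed variant: mean over all points of |f - a| / ((|a| + |f|) / 2).
    Points where both values are zero contribute zero error.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom == 0:
            continue  # both values are zero, so this point adds no error
        total += abs(f - a) / denom
    return total / len(actual)
```

Under these assumptions, a station's score would be obtained by calling `smape` on the hourly observed and predicted concentrations of one pollutant and then averaging the per-day results. Multiplying the returned fraction by 100 gives the percentage form.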