Assessing the Role of Temporal Information in Modelling Short-Term Air Pollution Effects Based on Traffic and Meteorological Conditions: A Case Study in Wrocław
Authors: | Guido Sciavicco, Enrico Marzano, Tomasz Turek, Andrea Brunello, Joanna Kamińska, Angelo Montanari |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
Variables; Computer science; Traffic flow; Temporal database; Term (time); Information extraction; Data mining; Time series; Categorical variable; Decision tree model
05 social sciences; 050101 languages & linguistics; 0501 psychology and cognitive sciences; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing |
Source: | Communications in Computer and Information Science; ISBN: 9783030302771; ADBIS (Short Papers and Workshops) |
Description: | Temporal aspects often play an important role in information extraction. Given the peculiarities of temporal data, their management typically requires dedicated algorithms, which make the overall data mining process complex, especially when a dataset is characterised by both temporal and atemporal information. In such a situation, typical solutions include combining different algorithms that handle the temporal and atemporal parts independently, or relying on an encoding of temporal data that makes it possible to apply classical machine learning algorithms (for example, lagged variables). This work investigates the management of temporal information in an environmental problem, namely assessing the relationships between concentrations of the pollutants \(NO_2\), \(NO_x\), and \(PM_{2.5}\) and a set of independent variables that include meteorological conditions and traffic flow in the city of Wrocław (Poland). We show that taking temporal information into account by means of lagged variables leads to better results than atemporal models. More importantly, even higher performance may be achieved with a recently proposed decision tree model, J48SS, which can handle heterogeneous datasets consisting of static (i.e., categorical and numerical) attributes as well as sequential and time series data. This outcome highlights the importance of proper temporal data modelling. |
Database: | OpenAIRE |
External link: |
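
The lagged-variable encoding mentioned in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `add_lagged_variables`, the column names, and the sample values are all hypothetical, and the records are assumed to be time-ordered at a fixed sampling interval. Each lag k adds a column holding the value observed k steps earlier, so that a classical, atemporal learner can see recent history as ordinary attributes.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def add_lagged_variables(rows: List[Row], column: str, lags: List[int]) -> List[Row]:
    """Return copies of `rows` with extra `<column>_lag<k>` fields.

    Assumes `rows` is ordered by time at a fixed interval (an assumption
    of this sketch). Where no value exists k steps back, the lag is None.
    """
    if any(k < 1 for k in lags):
        raise ValueError("lags must be positive integers")
    out: List[Row] = []
    for i, row in enumerate(rows):
        new_row = dict(row)  # copy so the input rows are left unchanged
        for k in lags:
            new_row[f"{column}_lag{k}"] = rows[i - k][column] if i - k >= 0 else None
        out.append(new_row)
    return out


# Hypothetical hourly records; the values are invented for illustration only.
hourly = [
    {"traffic": 100.0, "wind": 2.0, "no2": 40.0},
    {"traffic": 300.0, "wind": 1.5, "no2": 55.0},
    {"traffic": 250.0, "wind": 3.0, "no2": 50.0},
]
lagged = add_lagged_variables(hourly, "traffic", [1, 2])
```

In this sketch the first rows carry `None` for lags that reach before the start of the series; in practice such rows are usually dropped or imputed before training.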