Machine Learning Modelling and Feature Engineering in Seismology Experiment

Authors: Szymon Wojciechowski, Ivan Petryshynets, Khaled Giasin, V. G. Efremenko, Serhii Anatolievich Sylenko, Catalin I. Pruncu, Daniel Yurievich Pimenov, M. N. Brykov
Language: English
Year of publication: 2020
Subject:
Feature engineering
010504 meteorology & atmospheric sciences
Computer science
0805 Distributed Computing
lcsh:Chemical technology
010502 geochemistry & geophysics
Machine learning
computer.software_genre
seismology
01 natural sciences
Biochemistry
Article
Analytical Chemistry
Feature (machine learning)
lcsh:TP1-1185
acoustic data
0502 Environmental Science and Management
Electrical and Electronic Engineering
Instrumentation
0105 earth and related environmental sciences
0602 Ecology
business.industry
Work (physics)
laboratory experiment
artificial intelligence
Atomic and Molecular Physics, and Optics
0906 Electrical and Electronic Engineering
earthquake prediction
feature engineering
machine learning
Metric (mathematics)
Artificial intelligence
business
0301 Analytical Chemistry
computer
Seismology
Source: Sensors
Volume 20
Issue 15
Brykov, M N, Petryshynets, I, Pruncu, C I, Efremenko, V G, Pimenov, D Y, Giasin, K, Sylenko, S A & Wojciechowski, S 2020, 'Machine learning modelling and feature engineering in seismology experiment', Sensors, vol. 20, no. 15, 4228. https://doi.org/10.3390/s20154228
Sensors (Basel, Switzerland)
Sensors, Vol 20, Iss 4228, p 4228 (2020)
ISSN: 1424-8220
DOI: 10.3390/s20154228
Description: This article discusses machine learning modelling using a dataset provided by the LANL (Los Alamos National Laboratory) earthquake prediction competition hosted by Kaggle. The data were obtained from a laboratory stick-slip friction experiment that mimics real earthquakes. Digitized acoustic signals were recorded against time to failure of a granular layer compressed between steel plates. In this work, machine learning was employed to develop models that could predict earthquakes. The aim is to highlight the importance and potential applicability of machine learning in seismology. The XGBoost algorithm was used for modelling, combined with 6-fold cross-validation and the mean absolute error (MAE) metric for model quality estimation. Backward feature elimination was applied first, followed by forward feature construction, to find the best combination of features. The advantage of this feature engineering method is that it enables the best subset to be found from a relatively large set of features in a relatively short time. It was confirmed that a proper combination of statistical characteristics describing the acoustic data can be used for effective prediction of time to failure. Additionally, statistical features based on the autocorrelation of the acoustic data can be used to further improve model quality. A total of 48 statistical features were considered. The best subset was found to contain 10 features. Its corresponding MAE was 1.913 s, which was stable to the third decimal place. The presented results can be used to develop artificial intelligence algorithms devoted to earthquake prediction.
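The feature-selection procedure named in the description (backward elimination followed by forward construction, scored by 6-fold cross-validated MAE) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact acceptance rule, so the greedy "accept only if MAE improves" rule is an assumption. A small k-nearest-neighbour regressor stands in for XGBoost so that the example needs only the Python standard library, the data are synthetic, and all function names (`kfold_indices`, `cv_mae`, `knn_fit_predict`, `backward_then_forward`) are hypothetical.

```python
import random
from statistics import mean


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def knn_fit_predict(X_train, y_train, X_test, k=5):
    """Stand-in model (NOT XGBoost): average target of the k nearest rows."""
    preds = []
    for x in X_test:
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, xt)), yt)
            for xt, yt in zip(X_train, y_train)
        )
        preds.append(mean(t for _, t in dists[:k]))
    return preds


def cv_mae(X, y, features, fit_predict, k=6):
    """MAE averaged over k folds, using only the listed feature columns."""
    errors = []
    for test_idx in kfold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_train = [[X[i][j] for j in features] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [[X[i][j] for j in features] for i in test_idx]
        preds = fit_predict(X_train, y_train, X_test)
        errors.append(mean(abs(p - y[i]) for p, i in zip(preds, test_idx)))
    return mean(errors)


def backward_then_forward(X, y, fit_predict, k=6):
    """Greedy backward elimination, then forward re-addition (assumed rule)."""
    selected = list(range(len(X[0])))
    best = cv_mae(X, y, selected, fit_predict, k)
    removed = []

    # Backward pass: drop the feature whose removal lowers MAE the most.
    improved = True
    while improved and len(selected) > 1:
        improved = False
        score, j = min(
            (cv_mae(X, y, [f for f in selected if f != j], fit_predict, k), j)
            for j in selected
        )
        if score < best:
            best, improved = score, True
            selected.remove(j)
            removed.append(j)

    # Forward pass: re-add a dropped feature if it now lowers MAE.
    improved = True
    while improved and removed:
        improved = False
        score, j = min(
            (cv_mae(X, y, selected + [f], fit_predict, k), f) for f in removed
        )
        if score < best:
            best, improved = score, True
            selected.append(j)
            removed.remove(j)

    return sorted(selected), best


# Synthetic data: the target depends on features 0 and 1; 2 and 3 are noise.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(60)]
y = [2 * row[0] - row[1] + random.gauss(0, 0.1) for row in X]

selected, best_mae = backward_then_forward(X, y, knn_fit_predict)
print(selected, round(best_mae, 3))
```

Because a change is accepted only when it lowers the cross-validated MAE, the returned score can never be worse than the score obtained with all features. To use a gradient-boosted model instead, pass any callable with the same `(X_train, y_train, X_test)` signature as `fit_predict`.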
Database: OpenAIRE