Machine Learning Modelling and Feature Engineering in Seismology Experiment
Author: | Szymon Wojciechowski, Ivan Petryshynets, Khaled Giasin, V. G. Efremenko, Serhii Anatolievich Sylenko, Catalin I. Pruncu, Daniel Yurievich Pimenov, M. N. Brykov |
---|---|
Language: | English |
Publication year: | 2020 |
Subject: |
Feature engineering; 010504 meteorology & atmospheric sciences; Computer science; 0805 Distributed Computing; lcsh:Chemical technology; 010502 geochemistry & geophysics; Machine learning; computer.software_genre; seismology; 01 natural sciences; Biochemistry; Article; Analytical Chemistry; Feature (machine learning); lcsh:TP1-1185; acoustic data; 0502 Environmental Science and Management; Electrical and Electronic Engineering; Instrumentation; 0105 earth and related environmental sciences; 0602 Ecology; business.industry; Work (physics); laboratory experiment; artificial intelligence; Atomic and Molecular Physics and Optics; 0906 Electrical and Electronic Engineering; earthquake prediction; feature engineering; machine learning; Metric (mathematics); Artificial intelligence; business; 0301 Analytical Chemistry; computer; Seismology |
Source: | Sensors, Volume 20, Issue 15. Brykov, M N, Petryshynets, I, Pruncu, C I, Efremenko, V G, Pimenov, D Y, Giasin, K, Sylenko, S A & Wojciechowski, S 2020, 'Machine learning modelling and feature engineering in seismology experiment', Sensors, vol. 20, no. 15, 4228. https://doi.org/10.3390/s20154228. Sensors (Basel, Switzerland). Sensors, Vol 20, Iss 4228, p 4228 (2020) |
ISSN: | 1424-8220 |
DOI: | 10.3390/s20154228 |
Description: | This article discusses machine learning modelling using a dataset provided by the LANL (Los Alamos National Laboratory) earthquake prediction competition hosted by Kaggle. The data were obtained from a laboratory stick-slip friction experiment that mimics real earthquakes. Digitized acoustic signals were recorded against the time to failure of a granular layer compressed between steel plates. In this work, machine learning was employed to develop models that could predict earthquakes. The aim is to highlight the importance and potential applicability of machine learning in seismology. The XGBoost algorithm was used for modelling, combined with 6-fold cross-validation and the mean absolute error (MAE) metric for model quality estimation. Backward feature elimination was applied first, followed by forward feature construction, to find the best combination of features. The advantage of this feature engineering method is that it enables the best subset to be found from a relatively large set of features in a relatively short time. It was confirmed that a proper combination of statistical characteristics describing the acoustic data can be used for effective prediction of time to failure. Additionally, statistical features based on the autocorrelation of the acoustic data can be used to further improve model quality. A total of 48 statistical features were considered. The best subset was found to contain 10 features. Its corresponding MAE was 1.913 s, which was stable to the third decimal place. The presented results can be used to develop artificial intelligence algorithms devoted to earthquake prediction. |
Database: | OpenAIRE |
External link: |
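
The description names backward feature elimination followed by forward feature construction but does not give the exact procedure. The following is a minimal Python sketch of one generic greedy variant, not the authors' implementation. All names here (`backward_elimination`, `forward_construction`, `toy_score`, the feature names) are hypothetical. In the article the score would be the 6-fold cross-validated MAE of an XGBoost model; here a toy scoring function in arbitrary integer units stands in for it so the sketch runs without any dependencies.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# Lower is better. In practice this would be a cross-validated MAE;
# here it is an assumption-free placeholder type.
Score = Callable[[FrozenSet[str]], float]


def backward_elimination(features: Iterable[str], score: Score) -> Tuple[FrozenSet[str], float]:
    """Greedily drop the one feature whose removal lowers the score most."""
    current = frozenset(features)
    best = score(current)
    while len(current) > 1:
        # Score every subset obtained by removing exactly one feature.
        candidates = [(score(current - {f}), f) for f in sorted(current)]
        cand_score, cand_feature = min(candidates)
        if cand_score >= best:  # no single removal improves the score
            break
        current, best = current - {cand_feature}, cand_score
    return current, best


def forward_construction(start: Iterable[str], pool: Iterable[str],
                         score: Score, best: float) -> Tuple[FrozenSet[str], float]:
    """Greedily add back the one feature from the pool that lowers the score most."""
    current = frozenset(start)
    remaining = set(pool) - current
    while remaining:
        candidates = [(score(current | {f}), f) for f in sorted(remaining)]
        cand_score, cand_feature = min(candidates)
        if cand_score >= best:  # no single addition improves the score
            break
        current, best = current | {cand_feature}, cand_score
        remaining.discard(cand_feature)
    return current, best


# Hypothetical toy score (arbitrary integer units): informative features
# lower the error, and every feature carries a small fixed cost.
USEFUL = {"std": 50, "kurtosis": 30, "autocorr_lag1": 20}


def toy_score(subset: FrozenSet[str]) -> int:
    return 300 - sum(USEFUL.get(f, 0) for f in subset) + 5 * len(subset)


all_features = ["std", "kurtosis", "autocorr_lag1", "mean", "min", "max"]
reduced, reduced_score = backward_elimination(all_features, toy_score)
selected, best_score = forward_construction(reduced, all_features, toy_score, reduced_score)
print(sorted(selected), best_score)
```

To use this with a real model, `toy_score` would be replaced by a function that trains on the given feature subset and returns the cross-validated error. Each greedy pass scores at most one subset per remaining feature, which is why a search like this stays cheap compared with scoring every possible subset.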