An intrusion detection model to detect zero-day attacks in unseen data using machine learning.

Authors: Dai Z; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia., Por LY; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia., Chen YL; Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei, Taiwan., Yang J; Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, Kuala Lumpur, Malaysia., Ku CS; Department of Computer Science, Universiti Tunku Abdul Rahman, Kampar, Malaysia., Alizadehsani R; Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, Australia., Pławiak P; Department of Computer Science, Faculty of Computer Science and Telecommunications, Cracow University of Technology, Warszawska, Krakow, Poland.; Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka, Gliwice, Poland.
Language: English
Source: PloS one [PLoS One] 2024 Sep 11; Vol. 19 (9), pp. e0308469. Date of Electronic Publication: 2024 Sep 11 (Print Publication: 2024).
DOI: 10.1371/journal.pone.0308469
Abstract: In an era marked by pervasive digital connectivity, cybersecurity concerns have escalated. The rapid evolution of technology has led to a spectrum of cyber threats, including sophisticated zero-day attacks. This research addresses the challenge of existing intrusion detection systems in identifying zero-day attacks using the CIC-MalMem-2022 dataset and autoencoders for anomaly detection. The trained autoencoder is integrated with XGBoost and Random Forest, resulting in the models XGBoost-AE and Random Forest-AE. The study demonstrates that incorporating an anomaly detector into traditional models significantly enhances performance. The Random Forest-AE model achieved 100% accuracy, precision, recall, F1 score, and Matthews Correlation Coefficient (MCC), outperforming the methods proposed by Balasubramanian et al., Khan, Mezina et al., Smith et al., and Dener et al. When tested on unseen data, the Random Forest-AE model achieved an accuracy of 99.9892%, precision of 100%, recall of 99.9803%, F1 score of 99.9901%, and MCC of 99.8313%. This research highlights the effectiveness of the proposed model in maintaining high accuracy even with previously unseen data.
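For readers less familiar with the five metrics reported in the abstract, the following is a minimal sketch of how they are conventionally derived from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the authors' own evaluation code may differ.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall, F1 and MCC as fractions in [−1, 1] or [0, 1].

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts of a binary classifier (e.g. malware vs. benign).
    Undefined ratios (zero denominator) are reported as 0.0 by convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews Correlation Coefficient uses all four cells of the matrix.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts (NOT the paper's data): no false alarms, 10 missed attacks.
example = binary_metrics(tp=90, fp=0, tn=100, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With no false positives, precision stays at 1 while recall, F1 and MCC drop below it, which is the same pattern as the unseen-data figures quoted in the abstract (precision of 100% alongside slightly lower recall, F1 score and MCC).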
Competing Interests: The authors have declared that no competing interests exist.
(Copyright: © 2024 Dai et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Database: MEDLINE