Towards Near-Real-Time Intrusion Detection for IoT Devices using Supervised Learning and Apache Spark

Authors:	Salvatore Rampone, Valerio Morfino
Language:	English
Publication year:	2020
Subject:	iot; Computer Networks and Communications; Computer science; Real-time computing; syn-dos; lcsh:TK7800-8360; Cloud computing; 02 engineering and technology; Intrusion detection system; cloud environment; Spark (mathematics); 0202 electrical engineering, electronic engineering, information engineering; supervised machine learning; Electrical and Electronic Engineering; Training set; business.industry; Supervised learning; lcsh:Electronics; 020206 networking & telecommunications; Random forest; Elasticity (cloud computing); Hardware and Architecture; Control and Systems Engineering; Signal Processing; 020201 artificial intelligence & image processing; Anomaly detection; cyber-attacks; apache spark; business; mllib; hybrid approach
Source:	Electronics, Vol 9, Iss 3, p 444 (2020)
ISSN:	2079-9292
Description:	In the field of Internet of Things (IoT) infrastructures, attack and anomaly detection are rising concerns. With the increased use of IoT infrastructure in every domain, threats and attacks against these infrastructures are growing proportionally. In this paper, the performance of several machine learning algorithms in identifying cyber-attacks (namely SYN-DOS attacks) against IoT systems is compared, both in terms of application performance and in terms of training and application times. We use supervised machine learning algorithms included in the MLlib library of Apache Spark, a fast and general engine for big data processing. We show the implementation details and the performance of those algorithms on public datasets, using a training set of up to 2 million instances. We adopt a Cloud environment, emphasizing the importance of scalability and elasticity of use. Results show that all the Spark algorithms used achieve very good identification accuracy (> 99%). One of them, Random Forest, achieves an accuracy of 1. We also report a very short training time (23.22 s for Decision Tree with 2 million rows). The experiments also show a very low application time (0.13 s for more than 600,000 instances for Random Forest) using Apache Spark in the Cloud. Furthermore, the explicit model generated by Random Forest is very easy to implement in high- or low-level programming languages. In light of the results obtained, both in terms of computation times and identification performance, a hybrid approach for the detection of SYN-DOS cyber-attacks on IoT devices is proposed: the application of an explicit Random Forest model, implemented directly on the IoT device, along with a second-level analysis (training) performed in the Cloud.
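The on-device half of the hybrid approach described above (an explicit Random Forest model applied locally) might look roughly like the following minimal sketch. It assumes that each tree trained in the Cloud has been exported as plain if/else rules and that the device combines the trees by majority vote. The feature names (`syn_per_sec`, `ack_to_syn_ratio`, `half_open_conns`), the thresholds, and the three-tree forest are all hypothetical placeholders chosen for illustration; none of them is taken from the paper.

```python
from collections import Counter
from typing import Callable, Dict, List

Features = Dict[str, float]

# NOTE: every feature name and threshold below is a hypothetical placeholder,
# not a value from the paper. In the hybrid scheme these rules would be
# regenerated from the model trained in the Cloud.

def tree_1(x: Features) -> int:
    # Returns 1 for "attack", 0 for "benign".
    if x["syn_per_sec"] > 100.0:
        return 1 if x["ack_to_syn_ratio"] < 0.2 else 0
    return 0

def tree_2(x: Features) -> int:
    if x["ack_to_syn_ratio"] < 0.1:
        return 1
    return 1 if x["half_open_conns"] > 50 else 0

def tree_3(x: Features) -> int:
    if x["half_open_conns"] > 80:
        return 1
    return 1 if x["syn_per_sec"] > 500.0 else 0

# An odd number of trees avoids ties for a binary label.
FOREST: List[Callable[[Features], int]] = [tree_1, tree_2, tree_3]

def predict(x: Features) -> int:
    """Majority vote over the explicit trees."""
    votes = Counter(tree(x) for tree in FOREST)
    return votes.most_common(1)[0][0]

benign = {"syn_per_sec": 5.0, "ack_to_syn_ratio": 0.9, "half_open_conns": 2.0}
attack = {"syn_per_sec": 900.0, "ack_to_syn_ratio": 0.05, "half_open_conns": 200.0}
print(predict(benign), predict(attack))
```

Because the model is only comparisons and a vote counter, the same logic ports directly to C or another low-level language on constrained devices, which is the property the abstract highlights.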
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::57afa70508c4626a80534086513d6736 https://www.mdpi.com/2079-9292/9/3/444