A Model Based on LSTM Neural Networks to Identify Five Different Types of Malware

Authors: Eduardo de O. Andrade, José Viterbo, Joris Guérin, Cristina Nader Vasconcelos, Flavia Bernardini
Year of publication: 2019
Subject:
Source: KES
ISSN: 1877-0509
DOI: 10.1016/j.procs.2019.09.173
Description: Identifying malware has always been a great challenge. Companies and governments have invested considerable money and time to mitigate the impact of these threats. Nowadays, with the increasing amount of data available, it is possible to use more precise classification techniques. However, most large datasets that include malicious and non-malicious software are not public, which hinders the search for solutions based on technologies that rely on the availability of large amounts of data, such as deep learning. To overcome this limitation, this article introduces a new large dataset for malware classification, which has been made publicly available. We then propose a model for training a multiclass recurrent neural network (RNN) classifier, specifically a long short-term memory (LSTM) network, on this dataset. The model, which analyzes unstructured malware data, is then tested on unseen programs and reaches an accuracy of 67.60% over six classes, which include five different types of malware.
Database: OpenAIRE