Semantic malware classification using convolutional neural networks

Authors: Eliel Martins, Ricardo Santana, Javier Bermejo Higuera, Juan Ramón Bermejo Higuera, Juan Antonio Sicilia Montalvo
Year of publication: 2022
DOI: 10.21203/rs.3.rs-2040455/v1
Description: This paper addresses the classification of malware into families using static analysis and a convolutional neural network operating on raw bytes. Previous research indicates that machine learning is a promising approach to malware classification. The network is based on MalConv, a convolutional neural network that classifies malware by training on the whole binary. Minor modifications were made to improve results and to adapt the network to a multi-class problem. Four models were trained on data extracted from Portable Executable (PE) malware samples labeled into nine families. The data were extracted in two ways: according to the semantic variation of bytes and using the entire file. The trained models were then tested to check how well they generalize. The results of the four proposed models were compared with those of models trained according to similar research. We conclude that the header is the most important part of a PE file for malware identification purposes.
Database: OpenAIRE
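
Illustrative note: the description contrasts feeding a network the entire file with feeding it a selected region such as the PE header. The record does not say how the authors performed the extraction, so the following is only a minimal sketch under stated assumptions. It treats the "header" as everything up to the `SizeOfHeaders` offset of the PE optional header, and it encodes bytes as integers shifted by one so that 0 can serve as padding. The function names `pe_header_bytes` and `to_model_input`, the `PAD` constant, and the padding scheme are hypothetical and are not taken from the paper.

```python
import struct

PAD = 0  # hypothetical padding token; real byte values are shifted to 1..256


def pe_header_bytes(data: bytes) -> bytes:
    """Return the header region of a PE image (DOS header through section table).

    Assumption: "header" means the first SizeOfHeaders bytes of the file.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = e_lfanew + 4 + 20
    # SizeOfHeaders sits at offset 60 of the optional header (PE32 and PE32+).
    (size_of_headers,) = struct.unpack_from("<I", data, optional_header + 60)
    return data[:size_of_headers]


def to_model_input(raw: bytes, max_len: int) -> list:
    """Truncate or pad a byte string to a fixed-length list of integer tokens."""
    tokens = [b + 1 for b in raw[:max_len]]
    return tokens + [PAD] * (max_len - len(tokens))
```

Under these assumptions, a whole-file model would receive `to_model_input(data, max_len)` and a header-only model would receive `to_model_input(pe_header_bytes(data), max_len)`, with `max_len` chosen per model.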