SFCWGAN-BiTCN with Sequential Features for Malware Detection

Authors:	Bona Xuan, Jin Li, Yafei Song
Language:	English
Publication year:	2023
Subject:	malware classification; selection feature conditional Wasserstein generative adversarial network; bidirectional temporal convolutional network; whale optimization algorithm; extreme gradient boosting; Technology; Engineering (General). Civil engineering (General); TA1-2040; Biology (General); QH301-705.5; Physics; QC1-999; Chemistry; QD1-999
Source:	Applied Sciences, Vol 13, Iss 4, p 2079 (2023)
Document type:	article
ISSN:	2076-3417
DOI:	10.3390/app13042079
Description:	Generative adversarial networks (GANs) have performed well in the field of adversarial attacks, but few studies have applied them to supplementing malware samples, because discrete data are difficult to handle. More importantly, imbalanced malware family samples weaken the analytical power of malware detection models and mislead malware classification. To reduce the impact of malware family imbalance on accuracy, a selection feature conditional Wasserstein generative adversarial network (SFCWGAN) and a bidirectional temporal convolutional network (BiTCN) are proposed. First, we extract features from malware Opcode and API sequences and represent them with Word2Vec, emphasizing the semantic logic between API tuning and Opcode calling sequences. Second, the Spearman correlation coefficient is combined with the whale optimization algorithm extreme gradient boosting (WOA-XGBoost) algorithm to select features, filter out invalid features, and simplify the structure. Finally, we propose a GAN-based sequence feature generation algorithm: samples are generated with the conditional Wasserstein generative adversarial network (CWGAN) on the imbalanced malware family dataset, added to the training set to supplement the samples, and used to train the BiTCN. In tests on the Kaggle and DataCon datasets, the model achieved detection accuracies of 99.56% and 96.93%, respectively, which were 0.18% and 2.98% higher than those of the other methods compared.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/244ecc0f68134881a9a4407a0003ff66