Large-Scale Malicious Software Classification With Fuzzified Features and Boosted Fuzzy Random Forest

Authors:	Alan Wee-Chung Liew, Shilin Wang, Weiping Ding, Fang-Qi Li, Gongshen Liu
Year of publication:	2021
Subject:	Boosting (machine learning); Computer science; business.industry; Applied Mathematics; Fuzzy set; Decision tree; computer.software_genre; Machine learning; Fuzzy logic; Random forest; Support vector machine; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; Artificial Intelligence; Control and Systems Engineering; Malware; Artificial intelligence; Malware analysis; business; computer
Source:	IEEE Transactions on Fuzzy Systems. 29:3205-3218
ISSN:	1941-0034 1063-6706
Description:	Classification of malicious software, especially in a very large dataset, is a challenging task for machine intelligence. Malware can have highly diversified features, each of which has a highly heterogeneous distribution. These factors make the data difficult for traditional data-analytic approaches to handle. Although deep-learning-based methods have reported good classification performance, deep models usually lack interpretability and are fragile under adversarial attacks. To solve these problems, fuzzy systems have become a competitive candidate in malware analysis. In this article, a new fuzzy-based approach is proposed for malware classification. We focused on portable executable files on the Windows platform and analyzed the distributions of static features and content-oriented features. Fuzzification was used to reduce the ubiquitous impact of noise and outliers in a very large dataset. Finally, a novel boosted classifier consisting of fuzzy decision trees and a support vector machine is proposed to perform the malware classification. By using fuzzy decision trees, the inner structure of the classifier can be readily interpreted as discriminative rules, whereas the novel boosting strategy provides state-of-the-art classification performance. Extensive experimental results showed that our method significantly outperformed several state-of-the-art classifiers.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e8e64972c6b3cf1d1585b39b2ad7b953 https://doi.org/10.1109/tfuzz.2020.3016023
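
Note on the description: the abstract states that fuzzification reduces the impact of noise and outliers, but it does not give the membership functions used. The following is only a minimal sketch of the general idea, assuming three linguistic terms ("low", "medium", "high") with piecewise-linear memberships. The function name `fuzzify`, the breakpoints, and the example feature (a file size in KB) are hypothetical and are not taken from the article.

```python
from typing import Dict


def fuzzify(x: float, lo: float, mid: float, hi: float) -> Dict[str, float]:
    """Map a raw numeric feature to membership degrees in three fuzzy sets.

    Assumes lo < mid < hi. Values outside [lo, hi] are clamped, so an
    extreme outlier saturates at full membership in "low" or "high"
    instead of dominating the feature scale.
    """
    if not lo < mid < hi:
        raise ValueError("breakpoints must satisfy lo < mid < hi")

    x = min(max(x, lo), hi)  # clamp outliers to the supported range

    if x <= mid:
        t = (x - lo) / (mid - lo)  # position between lo and mid, in [0, 1]
        return {"low": 1.0 - t, "medium": t, "high": 0.0}

    t = (x - mid) / (hi - mid)  # position between mid and hi, in (0, 1]
    return {"low": 0.0, "medium": 1.0 - t, "high": t}


# Hypothetical feature: file size in KB with assumed breakpoints 0, 10, 20.
typical = fuzzify(5.0, 0.0, 10.0, 20.0)
outlier = fuzzify(1e9, 0.0, 10.0, 20.0)
```

In this sketch the membership degrees always sum to 1, and the outlier receives the same representation as any value at or above the upper breakpoint, which is one simple way a fuzzified feature can bound the influence of extreme values.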