Improving Classification of Metamorphic Malware by Augmenting Training Data with a Diverse Set of Evolved Mutant Samples
Authors: | Emma Hart, Kehinde O. Babaagba, Zhiyuan Tan |
---|---|
Year of publication: | 2020 |
Subject: |
Training set; Computer science; business.industry; Feature extraction; 02 engineering and technology; computer.software_genre; Machine learning; Support vector machine; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Malware; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Classifier (UML); Metamorphic malware |
Source: | CEC |
Description: | Detecting metamorphic malware provides a challenge to machine-learning models as trained models might not generalise to future mutant variants of the malware. To address this, we explore whether machine-learning models can be improved by augmenting training data-sets with samples of potential variants. These variants are generated using an evolutionary algorithm that evolves a behaviourally diverse set of mutants, optimised to avoid detection by a large set of existing detection-engines. Using features calculated from the behavioural trace of a sample as input, we evaluate the ability of five machine-learning methods to detect the new variants, show that the detection rate is considerably improved by including the new samples as training data, and that the classifiers still generalise over a range of malware. We then repeat this experiment using a sequence-based deep-learning method as the classifier, which is shown to out-perform the feature-based classifiers. |
Database: | OpenAIRE |
External link: |
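
The augmentation idea in the description can be illustrated with a toy sketch. This is not the authors' pipeline: the record does not specify the evolutionary algorithm, the five classifiers, or the behavioural features, so the example below assumes a plain 1-nearest-neighbour classifier and hand-written 2-D points standing in for behavioural feature vectors. All names (`predict_1nn`, `detection_rate`, `evolved_mutants`) and all data values are hypothetical.

```python
from math import dist

def predict_1nn(train, x):
    """Return the label of the training sample closest to x (1-nearest-neighbour)."""
    return min(train, key=lambda item: dist(item[0], x))[1]

def detection_rate(train, malicious_samples):
    """Fraction of malicious samples that the classifier labels 'malware'."""
    hits = sum(predict_1nn(train, x) == "malware" for x in malicious_samples)
    return hits / len(malicious_samples)

# Hypothetical 2-D "behavioural features"; values are made up for illustration.
benign = [((0.0, 0.0), "benign"), ((1.0, 0.5), "benign"), ((0.5, 1.0), "benign")]
known_malware = [((5.0, 5.0), "malware"), ((4.5, 5.5), "malware")]

# Stand-ins for evolved mutants: malicious, but located away from known malware.
evolved_mutants = [(4.0, 0.5), (3.5, 1.0), (4.5, 0.0), (4.0, 1.0)]
train_mutants, test_mutants = evolved_mutants[:2], evolved_mutants[2:]

# Baseline training set versus the same set augmented with labelled mutants.
baseline = benign + known_malware
augmented = baseline + [(x, "malware") for x in train_mutants]

rate_before = detection_rate(baseline, test_mutants)
rate_after = detection_rate(augmented, test_mutants)

# Check that augmentation did not break detection of the original malware.
original_rate_after = detection_rate(augmented, [x for x, _ in known_malware])

print(f"held-out mutants detected, baseline:  {rate_before:.2f}")
print(f"held-out mutants detected, augmented: {rate_after:.2f}")
print(f"known malware detected, augmented:    {original_rate_after:.2f}")
```

In this toy setting the held-out mutants sit closer to benign samples than to known malware, so the baseline classifier misses them, while the augmented training set places labelled mutants nearby. The numbers say nothing about the results reported in the paper.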