Bayesian Hyper-Parameter Optimisation for Malware Detection
Authors: | Fahad T. ALGorain, John A. Clark |
---|---|
Language: | English |
Year of publication: | 2022 |
Subject: | Computer Networks and Communications; Hardware and Architecture; Control and Systems Engineering; hyper-parameter optimisation; automated machine learning; static malware detection; tree parzen estimators; bayesian optimisation; random search; grid search; Signal Processing; Electrical and Electronic Engineering |
Source: | Electronics; Volume 11; Issue 10; Pages: 1640 |
ISSN: | 2079-9292 |
DOI: | 10.3390/electronics11101640 |
Description: | Malware detection is a major security concern and has been the subject of a great deal of research and development. Machine learning is a natural technology for addressing malware detection, and many researchers have investigated its use. However, the performance of machine learning algorithms often depends significantly on parametric choices, so the question arises as to what parameter choices are optimal. In this paper, we investigate how best to tune the parameters of machine learning algorithms (a process generally known as hyper-parameter optimisation) in the context of malware detection. We examine the effects of some simple (model-free) ways of parameter tuning together with a state-of-the-art Bayesian model-building approach. Our work is carried out using Ember, a major published malware benchmark dataset of Windows Portable Executable metadata samples, and a smaller dataset from kaggle.com (also comprising Windows Portable Executable metadata). We demonstrate that optimal parameter choices may differ significantly from default choices and argue that hyper-parameter optimisation should be adopted as a ‘formal outer loop’ in the research and development of malware detection systems. We also argue that doing so is essential for the development of the discipline since it facilitates a fair comparison of competing machine learning algorithms applied to the malware detection problem. |
Database: | OpenAIRE |