Impact of Dataset Representation on Smartphone Malware Detection Performance
Authors: | Jean-Marc Robert, Chamseddine Talhi, Abdelfattah Amamra |
---|---|
Year of publication: | 2013 |
Subject: | Potential impact; Computer science; Intrusion detection system; Machine learning; ComputingMethodologies_PATTERNRECOGNITION; System call; Computation complexity; Malware; Detection performance; Data mining; Artificial intelligence; Android (operating system); Classifier (UML) |
Source: | Trust Management VII ISBN: 9783642383229 IFIPTM |
DOI: | 10.1007/978-3-642-38323-6_12 |
Description: | Improving anomaly-based malware detection techniques for smartphones has been widely studied in recent years. Previous studies explore three factors: dataset size, dataset type and normal profile model. Tuning these factors improves performance but increases computational complexity and memory requirements. In this paper we explore a new factor: the dataset representation, that is, the format adopted to organize and represent the data. To investigate its impact, we evaluate four machine learning classifiers with three dataset representations: successive system calls, bag of system calls and pattern frequency of system calls. The dataset is a collection of system call traces from a smartphone running Android 2.2. We analyse the performance of each classifier and deduce the influence of dataset representation on accuracy and false positive rate. The results show that the dataset representation can affect classifier performance at low computational and memory cost. |
Database: | OpenAIRE |
External link: |
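
The description names three dataset representations without defining them. The following is a minimal Python sketch of one plausible reading, not the authors' implementation: the function names, the window length of 3, and the sample trace are all assumptions made for illustration. "Successive system calls" is taken here to mean fixed-length sliding windows over the trace, "bag of system calls" a per-call count vector that ignores order, and "pattern frequency" the relative frequency of each distinct window.

```python
from collections import Counter


def successive_calls(trace, window=3):
    """Sliding windows of consecutive system calls (order preserved).

    The window length is an assumption; the record does not state one.
    """
    return [tuple(trace[i:i + window]) for i in range(len(trace) - window + 1)]


def bag_of_calls(trace, vocabulary):
    """Count of each system call in the trace, ignoring call order."""
    counts = Counter(trace)
    return [counts[name] for name in vocabulary]


def pattern_frequencies(trace, window=3):
    """Relative frequency of each distinct window of consecutive calls."""
    windows = successive_calls(trace, window)
    total = len(windows)
    if total == 0:
        return {}
    return {pattern: n / total for pattern, n in Counter(windows).items()}


# Hypothetical trace, invented for illustration only.
trace = ["open", "read", "read", "write", "close", "open", "read", "read"]
vocabulary = sorted(set(trace))

sequences = successive_calls(trace)
bag = bag_of_calls(trace, vocabulary)
frequencies = pattern_frequencies(trace)
```

Under these assumptions, each representation yields a different feature set from the same trace: an ordered list of windows, a fixed-length count vector, or a mapping from pattern to frequency, any of which could then be passed to a classifier.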