Performance Evaluation of Filter-based Feature Selection Techniques in Classifying Portable Executable Files
Author: | C. D. Jaidhar, S. L. Shiva Darshan |
---|---|
Year of publication: | 2018 |
Subject: | 021110 strategic defence & security studies; Computer science; business.industry; Feature vector; Search engine indexing; 0211 other engineering and technologies; Pattern recognition; Feature selection; 02 engineering and technology; Mutual information; computer.file_format; 0202 electrical engineering electronic engineering information engineering; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; Artificial intelligence; business; Categorical variable; computer; Classifier (UML); General Environmental Science; Portable Executable; Curse of dimensionality |
Source: | Procedia Computer Science. 125:346-356 |
ISSN: | 1877-0509 |
Description: | The dimensionality of the feature space has a significant effect on the processing time and predictive performance of Malware Detection Systems (MDS). Therefore, the selection of relevant features is crucial for the classification process. A Feature Selection Technique (FST) reduces the dimensionality of the feature space by identifying and discarding noisy or irrelevant features from the original feature space. The significant features recommended by an FST improve the malware detection rate. This paper provides a performance analysis of four chosen filter-based FSTs and their impact on the classifier decision. The FSTs Distinguishing Feature Selector (DFS), Mutual Information (MI), Categorical Proportional Difference (CPD), and Darmstadt Indexing Approach (DIA) have been used in this work, and their efficiency has been evaluated using different datasets, various feature lengths, classifiers, and success measures. The experimental results indicate that DFS and MI offer competitive performance in terms of detection accuracy and that the efficiency of the classifiers does not decline on either the balanced or the unbalanced datasets. |
Database: | OpenAIRE |
External link: |
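
The description above refers to filter-based feature selection, in which each feature is scored independently of any classifier and only the top-scoring features are kept. As a rough illustration of that idea for one of the four named techniques, Mutual Information (MI), the following is a minimal sketch that assumes discrete (here binary) features and the generic textbook definition of MI. The function names `mutual_information` and `select_top_k` and the toy data are hypothetical and are not taken from the paper, whose exact MI variant and datasets may differ.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between one discrete feature column and the class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))   # counts of (feature value, class) pairs
    f_counts = Counter(feature)             # marginal counts of feature values
    c_counts = Counter(labels)              # marginal counts of classes
    mi = 0.0
    for (f, c), n_fc in joint.items():
        # p(f, c) * log2( p(f, c) / (p(f) * p(c)) ), written with raw counts
        mi += (n_fc / n) * math.log2(n_fc * n / (f_counts[f] * c_counts[c]))
    return mi


def select_top_k(X, y, k):
    """Rank features by MI with the labels and return the indices of the k best, plus all scores."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: rows are samples, columns are binary features
# (for example, presence or absence of some attribute in a file).
X = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
y = [1, 1, 0, 0]  # hypothetical class labels (1 = malware, 0 = benign)

selected, scores = select_top_k(X, y, k=1)
print(selected, scores)
```

In this toy data, the first feature coincides with the label and so receives the highest score, while the other two features are statistically independent of the label and score zero. A filter of this kind would pass only the selected columns to whichever classifier is used downstream.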