Efficient intrusion detection using representative instances
Authors: | Zhong-Kun Zhang, Chun Guo, Yuan Ping, Yuping Lai, Shou-Shan Luo, Yajian Zhou |
---|---|
Year of publication: | 2013 |
Subject: | Training set, General Computer Science, Anomaly-based intrusion detection system, Computer science, Feature selection, Intrusion detection system, Machine learning, Data set, Support vector machine, Data pre-processing, Data mining, Artificial intelligence, Law, Classifier (UML) |
Source: | Computers & Security. 39:255-267 |
ISSN: | 0167-4048 |
DOI: | 10.1016/j.cose.2013.08.003 |
Description: | Because of their feasibility and effectiveness, artificial intelligence-based intrusion detection systems attract considerable interest from researchers. However, when confronted with large-scale data sets, many artificial intelligence-based intrusion detection systems could suffer from a high computational burden, even though the feature selection method can help to reduce the computational complexity. To improve the efficiency, we propose a representative instance selection method to preprocess the original data set before training a classifier, which is independent of the learning algorithm that is used for constructing the intrusion detection system. In this study, a new metric is introduced to measure the representative power of an instance with respect to its class. Based on an implementation of representativeness, we select the most representative instance in each subset divided by a novel centroid-based partitioning strategy, and then, we utilise the result as training data to build various intrusion detection models efficiently. Experimental results on a labelled flow-based data set introduced in 2009 show that ANN, KNN, SVM and Liblinear learning with a largely reduced set of representative instances can not only achieve high efficiency in detecting network attacks but also provide comparable detection performance in terms of the detection rate, precision, F-score and accuracy, as compared with four corresponding classifiers built with the original large data set. |
Database: | OpenAIRE |
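
The description above outlines a pipeline: partition each class around its centroid, keep one representative instance per subset, and train on the reduced set. The record does not define the paper's representativeness metric or its partitioning strategy, so the following Python sketch substitutes two simple assumptions: subsets are formed by sorting a class's instances by distance to the class centroid and cutting the sorted list into contiguous chunks, and the "most representative" instance of a subset is taken to be the one closest to that subset's centroid. The function name `select_representatives` and the parameter `n_subsets` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def select_representatives(X, y, n_subsets=3):
    """Return a reduced (X, y) with at most n_subsets instances per class.

    Both the partitioning rule and the representativeness score used here
    are stand-in assumptions, not the metric defined in the paper.
    """
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)

    X_reduced, y_reduced = [], []
    for label, points in by_class.items():
        class_centroid = centroid(points)
        # Assumed partitioning: order instances by distance to the class
        # centroid, then cut the ordering into roughly equal subsets.
        ordered = sorted(points, key=lambda p: math.dist(p, class_centroid))
        k = min(n_subsets, len(ordered))
        size = math.ceil(len(ordered) / k)
        for start in range(0, len(ordered), size):
            subset = ordered[start:start + size]
            subset_centroid = centroid(subset)
            # Assumed representativeness: closeness to the subset centroid.
            best = min(subset, key=lambda p: math.dist(p, subset_centroid))
            X_reduced.append(best)
            y_reduced.append(label)
    return X_reduced, y_reduced
```

The reduced lists can then be passed to any classifier in place of the full training set, which reflects the description's point that the selection step is independent of the learning algorithm.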