Optimizing Feature Selection Method in Intrusion Detection System Using Thresholding.

Authors: Faizin, Muhammad Arif, Kurniasari, Dias Tri, Elqolby, Nazhifah, Putra, Muhammad Aidiel Rachman, Ahmad, Tohari
Subject:
Source: International Journal of Intelligent Engineering & Systems; 2024, Vol. 17 Issue 3, p214-226, 13p
Abstract: Information and communication technology is growing rapidly, making it a target of various attacks, such as data theft, phishing, and Denial of Service (DoS). There are many ways to handle attacks on communication networks, including developing an Intrusion Detection System (IDS) model. IDS research is extensive and focuses on issues such as feature selection and handling data imbalance. Feature selection is essential to an IDS model because IDS datasets typically have many features, and the number of features included in classification can affect detection performance. This research proposes an IDS that combines mutual information feature selection with thresholding and the XGBoost classification algorithm. Mutual information is used to measure the dependency between each input feature and the target feature. Once the mutual information scores are obtained, thresholding is used to decide the best number of features for the classification process. The data are then classified by XGBoost using the selected features. The proposed method was evaluated using four metrics: accuracy, precision, recall, and F1 score. This study used UNSW-NB15 as the primary dataset to analyze the best combinations of feature selection method and threshold value. In addition, the proposed method was tested on the NSL-KDD and CIC-IDS2017 datasets to compare its performance with previous research. The proposed method performs best on the CIC-IDS2017 dataset, with 99.89% accuracy and a 99.68% F1 score. Furthermore, it can reduce computational training time compared with other IDS methods that use only feature selection or tree-model-based algorithms without thresholds. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
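
The feature selection step described in the abstract (score each feature by its mutual information with the target, then keep the features that pass a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the discrete plug-in estimator, the toy data, the feature names (`flag`, `noise`, `proto`), and the threshold value of 0.1 are all assumptions made for illustration, and the XGBoost classification step is not shown.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) between two
    discrete sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log(count * n / (px[x] * py[y]))
    return mi


def select_by_threshold(features, labels, threshold):
    """Score every feature column against the labels and keep those whose
    score is at least `threshold` (a hypothetical selection rule)."""
    scores = {name: mutual_information(col, labels)
              for name, col in features.items()}
    selected = [name for name, score in scores.items() if score >= threshold]
    return selected, scores


# Toy data, invented for illustration (not taken from UNSW-NB15 or any dataset).
labels = [0, 0, 1, 1, 0, 1, 0, 1]
features = {
    "flag":  [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the label
    "noise": [0, 1, 0, 1, 1, 0, 0, 1],  # statistically independent of the label
    "proto": [0, 0, 1, 1, 0, 1, 1, 1],  # partially informative
}

selected, scores = select_by_threshold(features, labels, threshold=0.1)
print(selected)
```

In this toy setup the feature that carries no information about the label is dropped, while the other two would be passed on to the classifier. For continuous features, a different estimator would be needed than the count-based one assumed here.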