Sequence-based Anomaly Detection for Analyzing and Identifying Malicious Network Behavior
Author: | Ching-Hao Mao, 毛敬豪 |
---|---|
Year of publication: | 2010 |
Document type: | Thesis (學位論文) |
Description: | Malicious behaviors with similar intent or purpose often exhibit different data sequence patterns, which makes them harder to identify. These sequence variations originate from three sources: (1) the multiplicity of causal relationships appearing in data sequence patterns, (2) the injection of noise into the attack sequence, and (3) the interweaving of various malicious behaviors. As a result, current intrusion detection systems perform poorly at capturing the causal relations of malicious behaviors, correlating them precisely, and correctly reconstructing the attack scenario, and they consequently produce considerable numbers of false alarms. Furthermore, analysis of the abnormal events flagged by an intrusion detection system depends heavily on security experts and analyst manpower. In this thesis, we propose a Sequence-based Anomaly Detection (SBAD) mechanism for identifying malicious behaviors with causal relationships. We first propose a behavior thread extraction mechanism based on host topologies, which automatically extracts behavior sequences without any parameter configuration or prior knowledge. We then describe each behavior sequence using a probabilistic graph-based model and the “n-step-window-based bi-gram” correlation method, which tolerates the large variances that result from the different data sequences manifested by different malicious attacks, event interleaving, and long interjections. Using these models, we propose a novel “sequence dissimilarity measure” that leverages the “minimum description length” encoding method to maximize the dissimilarity between different groups of events. A large dissimilarity implies that two groups of events behave differently, while a small dissimilarity implies that they behave similarly.
Finally, a “manifold learning” analytical method is applied to extract meaningful features from a large number of features via visualized interfaces, combined with either regular classification methods or active learning methods, to identify suspicious or malicious behavior sequences. The proposed SBAD makes the following contributions: (1) a behavior thread extraction mechanism that requires no parameter configuration or a priori knowledge; (2) a novel dissimilarity measure for event sequences, under which not only sequence to sequence but also sequence to template and template to template can be compared; (3) an event analyzer, justified by a real case study, that can effectively deal with event sequences containing noise, interleaving, interjection (long insertion), interception, missing alerts, etc.; (5) a visualization mechanism (Isomap) that facilitates visual understanding of how malicious and normal event sequences are distributed in a low-dimensional space. We evaluated the proposed methods on datasets from different sources (e.g., public data, semi-public data, and real-world private data). The evaluation results show that the proposed SBAD method can identify various malicious sequence patterns in highly noisy environments. Moreover, through the graphical structures and the representation produced by the manifold learning method, visualized results can be provided naturally for further study or analysis by security experts or operations personnel. |
Database: | Networked Digital Library of Theses & Dissertations |
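
The abstract names an “n-step-window-based bi-gram” correlation method but does not define it. The sketch below shows one plausible reading, not the thesis's actual algorithm: count every ordered pair of events in which the second event occurs within `n` steps after the first, then normalize the counts into edge weights of a simple probabilistic graph. The function names `windowed_bigrams` and `transition_probs` and the event labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def windowed_bigrams(events, n):
    """Count ordered pairs (a, b) where b occurs 1..n steps after a.

    Assumption: this is one reading of "n-step-window-based bi-gram";
    with n == 1 it reduces to ordinary adjacent bi-grams.
    """
    counts = Counter()
    for i, a in enumerate(events):
        for b in events[i + 1 : i + 1 + n]:
            counts[(a, b)] += 1
    return counts


def transition_probs(counts):
    """Normalize pair counts into estimates of P(b | a) (graph edge weights)."""
    totals = Counter()
    for (a, _), c in counts.items():
        totals[a] += c
    return {(a, b): c / totals[a] for (a, b), c in counts.items()}


# Hypothetical event sequences: the second one has noise events interleaved.
clean = ["scan", "exploit", "download"]
noisy = ["scan", "ping", "exploit", "ping", "download"]

adjacent = windowed_bigrams(noisy, 1)   # plain bi-grams lose the attack pairs
windowed = windowed_bigrams(noisy, 2)   # a 2-step window recovers them
```

With a window of 1, the interleaved `ping` events hide the `scan` to `exploit` and `exploit` to `download` pairs. A window of 2 recovers both, which illustrates how a windowed pairing can tolerate interleaved noise.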