File classification using byte sub-stream kernels

Author: Olivier de Vel
Year of publication: 2004
Subject:
Source: Digital Investigation. 1:150-157
ISSN: 1742-2876
Description: The ability to automatically classify files based on their low-level, short-range structures is of particular importance in computer forensics. We report a study on the automatic learning of file classification using byte sub-stream kernels that capture these low-level structures. We automatically discover byte-level patterns in a file by extracting a byte sequence feature map and use a suffix trie data structure to efficiently store and manipulate the feature map. Using the feature map we compute the spectrum kernel and, together with a support vector machine classifier algorithm, we are able to efficiently categorize a variety of different system and application file types. In experiments, the approach achieved good file classification performance.
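The spectrum kernel named in the description can be illustrated with a minimal sketch. It is not the paper's implementation: the function names and the substring length k=3 are assumptions chosen for illustration, and a plain hash-based counter stands in for the suffix trie. The kernel value for two byte strings is the inner product of their length-k substring count vectors.

```python
from collections import Counter
from math import sqrt


def spectrum_features(data: bytes, k: int = 3) -> Counter:
    """Count every contiguous length-k byte substring of data.

    Assumption: a Counter replaces the suffix trie used in the paper.
    Inputs shorter than k yield an empty feature map.
    """
    return Counter(data[i:i + k] for i in range(len(data) - k + 1))


def spectrum_kernel(a: bytes, b: bytes, k: int = 3) -> int:
    """Inner product of the two k-spectrum count vectors."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    # Iterate over the smaller map; substrings absent from the other map count as 0.
    if len(fa) > len(fb):
        fa, fb = fb, fa
    return sum(count * fb[gram] for gram, count in fa.items())


def normalized_kernel(a: bytes, b: bytes, k: int = 3) -> float:
    """Cosine-normalized kernel, so that file length does not dominate."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0
```

A matrix of such kernel values over a set of training files could then be handed to any support vector machine implementation that accepts a precomputed kernel; which SVM formulation and parameters the study used is not stated in this record.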
Database: OpenAIRE