Evaluating the Effect of Compression Settings in the Classification of Image File Formats

Authors: Mehdi Teimouri, Zahra Seyedghorban
Year of publication: 2020
Subject:
Source: 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE).
DOI: 10.1109/iccke50421.2020.9303655
Description: The classification of file fragments of various file formats is an important task in many applications, such as intrusion detection systems, web content filtering, and digital forensics. To date, many research works have presented various feature sets and methods for file fragment classification. Despite this variety, no research work has focused specifically on image file formats. In this paper, the classification of image file formats is studied. Moreover, we examine the effect of different compression settings on the accuracy of a trained model. We show that when only specific compression settings are considered during the training phase, the trained model performs poorly on unseen compression settings. Considering this fact, we propose a method in which fragments with different compression settings but the same file format are merged to form a more general class label. We compare our approach with three other methods proposed in the literature. Results indicate that the proposed feature set leads to a more accurate classifier.
Database: OpenAIRE