Unicode search of dirty data, or: How I learned to stop worrying and love Unicode Technical Standard #18

Authors: Jon Stewart, Joel Uckelman
Year of publication: 2013
Subject:
Source: Digital Investigation. 10:S116-S125
ISSN: 1742-2876
DOI: 10.1016/j.diin.2013.06.013
Description: This paper discusses problems arising in digital forensics with regard to Unicode, character encodings, and search. It describes how multipattern search can handle the different text encodings encountered in digital forensics and a number of issues pertaining to proper handling of Unicode in search patterns. Finally, we demonstrate the feasibility of the approach and discuss the integration of our developed search engine, lightgrep, with the popular bulk_extractor tool.
Database: OpenAIRE