Probabilistic Redaction

Author: Joe Loughry
Year of publication: 2016
Subject:
Description: A prototype of an automated, interactive redaction assistant was developed. The CLOAK system is based on the Web 1T 5-gram corpus, a list of every unique English word and phrase, up to five words in length, that was observed on the World Wide Web and collected by Google, Inc. in 2006. It automatically flags candidate words, phrases, sentences, and paragraphs in documents under review that are likely to be classified, and suggests redactions that would make the document unclassified. Each suggested redaction takes into account security classification guidance from more than one guide at a time. The system is probabilistic in how it prioritizes its suggestions: according to the measured rate of occurrence of words and phrases observed
Database: OpenAIRE
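
The prioritization mentioned in the abstract above can be illustrated with a minimal sketch. This is not the CLOAK implementation: the count table, the function names, and above all the ordering (least frequently observed phrases first) are assumptions made for illustration, since the record is cut off before it says how the occurrence rates are used.

```python
# Hypothetical stand-in for an n-gram frequency table: phrase -> observed count.
# The counts below are invented for illustration, not taken from Web 1T.
NGRAM_COUNTS = {
    "the": 1_000_000,
    "project": 50_000,
    "the project": 20_000,
    "bluebird": 120,
    "project bluebird": 3,
}


def ngrams(tokens, max_n=5):
    """Yield every contiguous n-gram of 1..max_n tokens as a single string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def rank_candidates(text, counts, max_n=5):
    """Return the distinct n-grams of `text`, least frequently observed first.

    Assumption: rarer phrases are reviewed first. Phrases missing from the
    table are treated as having a count of 0. Ties are broken alphabetically.
    """
    tokens = text.lower().split()
    distinct = set(ngrams(tokens, max_n))
    return sorted(distinct, key=lambda g: (counts.get(g, 0), g))


ranked = rank_candidates("The project Bluebird", NGRAM_COUNTS)
for phrase in ranked:
    print(NGRAM_COUNTS.get(phrase, 0), phrase)
```

With this toy table, the phrase that never appears in the table is listed first and the very common word "the" is listed last. A real system would also need to combine these scores with the classification guides, which this sketch leaves out.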