Confidential Data Identification Using Text Summarization Technique in Data Leakage Prevention System

Author: WU, WEI-ZI, 吳威志
Year of publication: 2018
Document type: 學位論文 ; thesis
Description: 106
Data Leakage Prevention (DLP), a key element of intellectual property protection, is expected to provide privacy preservation benefits for multiple stakeholders. One of the key factors that will determine the success of DLP is confidential document classification, which identifies sensitive information that is critical to stakeholders. However, with traditional DLP (based on either features or statistics), a manager cannot accurately identify confidential documents under variant attacks (such as rephrased or embedded confidential content). In this study, we propose Gemini, an automatic and intelligent DLP system. Gemini is the first such system that removes irrelevant document content and preserves key points using summarization techniques, measures the remaining key features, and evaluates each document's category. The applicability of our approach was demonstrated on two different datasets and three possible real-world scenarios. Extensive experiments show that Gemini is superior to other methods in classifying confidential documents when the confidential documents differ from the original texts used in the training phase.
Database: Networked Digital Library of Theses & Dissertations