Authorship Obfuscation System Development based on Long Short-term Memory Algorithm

Author: Hendrik Maulana, Riri Fitri Sari
Language: English
Year of publication: 2022
Subject:
Source: International Journal of Technology, Vol 13, Iss 2, Pp 345-355 (2022)
Document type: article
ISSN: 2086-9614
2087-2100
DOI: 10.14716/ijtech.v13i2.4257
Description: Stylometry is an authorship analysis technique based on statistics. Through stylometry, the author of a document can be identified with high accuracy, which poses a threat to the author's privacy. Stylometry also offers a countermeasure, authorship identity elimination (authorship obfuscation), which can protect writers' privacy. This study applies an authorship identity elimination method to the Federalist Papers corpus. The Federalist Papers are a well-known corpus that has been studied extensively, especially for authorship identification, because it contains 12 disputed texts. One identification method uses the support vector machine (SVM) algorithm, with which the authorship of the disputed texts can be determined with 86% accuracy. The authorship identity elimination method changes the writing style of a text while preserving its meaning. Long short-term memory (LSTM) is a deep learning algorithm that can predict words well. Using a model built with the LSTM algorithm, the writing style of the disputed documents in the Federalist Papers can be changed. As a result, 4 of the 12 disputed documents were changed from one author identity to another. The similarity between the changed documents and the originals ranges from 40% to 57%, which indicates that the meaning of the original documents is preserved. Based on our experimental results, we conclude that the proposed method can eliminate authorship identity well.
Database: Directory of Open Access Journals
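
The description refers to two measurable quantities: a statistical writing-style profile and a similarity level between an original and a changed document. The record does not say which features or which similarity measure the paper uses, so the following is only a minimal sketch under stated assumptions: the style profile is taken to be relative function-word frequencies, the similarity is taken to be cosine similarity over word counts, and the word list and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, illustrative function-word list; it is not taken from the paper.
FUNCTION_WORDS = ("the", "of", "to", "and", "in", "upon", "by", "on")


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def style_profile(text):
    """Relative frequency of each function word (a simple stylometric vector)."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def text_similarity(original, rewritten):
    """Cosine similarity of word counts (an assumed metric, not the paper's)."""
    a = Counter(tokenize(original))
    b = Counter(tokenize(rewritten))
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In a setup like the one described, vectors such as those returned by `style_profile` could be passed to a classifier (for example an SVM) for attribution, while `text_similarity` would compare a rewritten text with its original. Both choices are assumptions made for illustration.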