Multi-platform authorship verification

Author: Abdulaziz Altamimi, Steven Furnell, Fudong Li, Nathan Clarke
Language: English
Year of publication: 2019
Subject:
Source: Altamimi, A, Clarke, N, Furnell, S & Li, F 2019, Multi-platform authorship verification. In CECC 2019: Proceedings of the Third Central European Cybersecurity Conference, 13, Association for Computing Machinery (ACM), New York, NY, United States, CECC 2019, Munich, Germany, 14/11/19. https://doi.org/10.1145/3360664.3360677
CECC
Description: The variety and popularity of messaging systems such as social network messaging, text messages, email and Twitter have increased rapidly, with users frequently exchanging messages across various platforms. Unfortunately, amongst the legitimate messages there is a host of illegitimate and inappropriate content, with cyber stalking, trolling and computer-assisted crime all taking place. Therefore, there is a need to identify individuals using messaging systems. Stylometry is the study of linguistic features in a text; applied to authorship verification, it consists of checking whether or not a target text was written by a specific individual, based on that individual's writing style. Whilst much research has taken place within authorship verification, studies have focused upon single platforms and have often had limited datasets and restricted methodologies, which makes it difficult to appreciate the real-world value of the approach. This paper seeks to overcome these limitations by providing an analysis of authorship verification across four common messaging systems. This approach enables a direct comparison of recognition performance and provides a basis for analyzing the feature vectors across platforms to better understand which aspects each capitalizes upon in order to achieve good classification. The experiments also include an investigation into feature vector creation, utilizing population-based and user-based techniques to compare and contrast performance. The experiment involved 50 participants across four common platforms, with a total of 13,617; 106,359; 4,539; and 6,540 samples for Twitter, SMS, Facebook and Email, achieving an Equal Error Rate (EER) of 20.16%, 7.97%, 25% and 13.11% respectively.
Database: OpenAIRE
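
Note on the Equal Error Rate (EER) reported in the abstract: the EER is the operating point at which the false acceptance rate (impostor texts accepted as the claimed author) equals the false rejection rate (genuine texts rejected). The record does not describe the paper's classifier or its scores, so the following is only a minimal sketch of how an EER can be estimated from verification scores. The function name `equal_error_rate`, the assumption that higher scores mean "more likely the claimed author", and the example scores are all hypothetical and are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from two lists of similarity scores.

    Assumption: a higher score means the sample is more likely to come
    from the claimed author, and a sample is accepted when score >= threshold.
    Returns (eer, threshold) at the candidate threshold where the false
    acceptance and false rejection rates are closest to each other.
    """
    best = None
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection rate: genuine samples scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor samples scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores for illustration only (not data from the paper).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finitely many samples the two error rates rarely cross exactly, which is why the sketch takes the threshold with the smallest gap and averages the two rates; interpolating between neighbouring thresholds is a common alternative.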