Assessing the Level of Stability of Idiolectal Features across Modes, Topics and Time of Text Production
Author: | Tatiana Litvinova, Olga Litvinova, Pavel Seredin |
---|---|
Language: | English |
Publication year: | 2018 |
Subject: | |
Source: | Proceedings of the XXth Conference of Open Innovations Association FRUCT, Vol 602, Iss 23, Pp 223-230 (2018) |
Document type: | article |
ISSN: | 2305-7254 2343-0737 |
Description: | Authorship attribution, i.e., the task of identifying the author of a disputed text, is one of the challenging issues facing digital forensics. Cross-domain authorship attribution, in which training and test texts differ in genre, topic and even mode (written/oral), is the most realistic yet the most difficult scenario. All authorship attribution studies rely on the notion of an idiolect, understood as a set of stable features, yet few studies have explored how stable idiolectal features actually are. The aim of the paper is to reveal, through a series of experiments, the effect of mode, topic and time of text production on the stability of idiolectal features. Our pilot study revealed that a change of mode (written/oral) causes the most striking differences in text parameters compared with a change of topic or time of production, although some features (namely, relative frequencies of certain discourse markers) remain relatively stable in all experimental setups. We conclude that a corpus containing diverse types of texts from each individual is needed to examine the stability of idiolectal features thoroughly and to develop cross-domain attribution techniques that can be employed in realistic scenarios. |
Database: | Directory of Open Access Journals |