A Comparative Analysis of Distributional Term Representations for Author Profiling in Social Media

Authors: Álvarez-Carmona, Miguel Á., Villatoro-Tello, Esaú, Montes-y-Gómez, Manuel, Villaseñor-Pienda, Luis
Year of publication: 2019
Subject:
Source: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4857-4868, 2019
Document type: Working Paper
DOI: 10.3233/JIFS-179033
Description: Author Profiling (AP) aims at predicting specific characteristics of a group of authors by analyzing their written documents. Much research has focused on determining suitable features for modeling authors' writing patterns. Reported results indicate that content-based features remain the most relevant and discriminative features for this task. Thus, in this paper, we present a thorough analysis of the appropriateness of different distributional term representations (DTRs) for the AP task. To this end, we introduce a novel framework for supervised AP using these representations and, building on it, carry out a comparative analysis of representations such as DOR, TCOR, SSR, and word2vec on the AP problem. We also compare the performance of the DTRs against classic approaches, including popular topic-based methods. The obtained results indicate that DTRs are suitable for solving the AP task in social media domains, as they achieve competitive results while providing meaningful interpretability.
Database: arXiv