Attentive recurrent text categorisation models using word-category cross embedding on financial comments

Authors: Siegfried Handschuh, André Freitas, Markus Endres, Vivian Dos Santos Silva, Macedo Maia
Publication year: 2020
Subject:
Source: SAC
DOI: 10.1145/3341105.3374097
Description: User-oriented comments in web forums carry rich information that helps people find answers about a topic or subject. However, real-world data often contain redundant or irrelevant content, which introduces noise into text classification tasks. In this paper, we propose an attentive recurrent neural network model for financial text categorisation based on a cross-embedding input representation. Our cross-embedding approach combines contextual information from the words in a comment with contextual information from the labels in a category set. Besides the proposed model, we also built an annotated real-world dataset containing complex comments posted by different users on a financial website. Experimental results compare our approach with state-of-the-art baselines based on Recurrent Neural Network (RNN) extensions. The best configuration shows an increase of almost 12% in precision and 8% in F1-score compared to the second-best-performing baseline.
Database: OpenAIRE