Abusive text detection using neural networks

Authors: Chen, H., McKeever, S., Delany, S. J.
Subject:
Source: Scopus-Elsevier
Conference papers
Articles
Description: Neural network models have become increasingly popular for text classification in recent years. In particular, the use of word embeddings within deep learning architectures has attracted considerable attention from researchers. In this paper, we first review how neural network models have been applied to text classification. Second, we extend our previous work [4, 3] by using a neural network strategy for the task of abusive text detection. We compare word embedding features with traditional feature representations such as n-grams and handcrafted features. In addition, we use an off-the-shelf neural network classifier, FastText [16]. Based on our results, we conclude that: (1) extracting selected manual features can improve abusive content detection over using basic n-grams; (2) although averaging pre-trained word embeddings is a naive method, the distributed feature representation performs better than n-grams on most of our datasets; (3) while the FastText classifier is fast and efficient, its results are not remarkable, as it is a shallow neural network with only one hidden layer; (4) using pre-trained word embeddings does not guarantee better performance in the FastText classifier.
Database: OpenAIRE
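
The description refers to averaging pre-trained word embeddings as a naive document representation. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `average_embedding`, the toy two-dimensional vectors, and the choice to return a zero vector for texts with no known words are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def average_embedding(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Return the mean vector of the in-vocabulary tokens.

    Out-of-vocabulary tokens are skipped. If no token is known,
    a zero vector of length `dim` is returned (an assumed fallback).
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy 2-dimensional vectors, invented for illustration only;
# real pre-trained embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "you": [1.0, 0.0],
    "are": [0.0, 1.0],
    "awful": [2.0, 2.0],
}

# One fixed-length feature vector per text, regardless of text length.
features = average_embedding("you are awful".split(), toy_embeddings, dim=2)
print(features)
```

The resulting fixed-length vector could then be passed to any standard classifier in place of n-gram counts. Word order is discarded by the averaging step, which is one reason the method is described as naive.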