Abusive text detection using neural networks
Autor: | Chen, H., susan mckeever, Delany, S. J. |
---|---|
Předmět: | |
Zdroj: | Scopus-Elsevier Conference papers Articles |
Popis: | eural network models have become increasingly popular for text classification in recent years. In particular, the emergence of word embeddings within deep learning architectures has recently attracted a high level of attention amongst researchers. In this paper, we focus on how neural network models have been applied in text classification. Secondly, we extend our previous work [4, 3] using a neural network strategy for the task of abusive text detection. We compare word embedding features to the traditional feature representations such as n-grams and handcrafted features. In addition, we use an off-the-shelf neural network classifier, FastText[16]. Based on our results, the conclusions are: (1) Extracting selected manual features can increase abusive content detection over using basic ngrams; (2) Although averaging pre-trained word embeddings is a naive method, the distributed feature representation has better performance to ngrams in most of our datasets; (3) While the FastText classifier works efficiently with fast performance, the results are not remarkable as it is a shallow neural network with only one hidden layer; (4) Using pre-trained word embeddings does not guarantee better performance in the FastText classifier |
Databáze: | OpenAIRE |
Externí odkaz: |