Abusive text detection using neural networks

Author:	Chen, H., Susan McKeever, Delany, S. J.
Subject:	abusive machine learning feature selection classification Computer Sciences detection abusive detection neural networks labelling strategy text
Source:	Scopus-Elsevier Conference papers Articles
Description:	Neural network models have become increasingly popular for text classification in recent years. In particular, the emergence of word embeddings within deep learning architectures has recently attracted a high level of attention amongst researchers. In this paper, we first focus on how neural network models have been applied in text classification. Secondly, we extend our previous work [4, 3] using a neural network strategy for the task of abusive text detection. We compare word embedding features to traditional feature representations such as n-grams and handcrafted features. In addition, we use an off-the-shelf neural network classifier, FastText [16]. Based on our results, the conclusions are: (1) extracting selected manual features can improve abusive content detection over using basic n-grams; (2) although averaging pre-trained word embeddings is a naive method, the distributed feature representation performs better than n-grams on most of our datasets; (3) while the FastText classifier is efficient and fast, its results are not remarkable, as it is a shallow neural network with only one hidden layer; (4) using pre-trained word embeddings does not guarantee better performance in the FastText classifier.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::57d9c38e384c050be4829722491cee7e http://www.scopus.com/inward/record.url?eid=2-s2.0-85046033767&partnerID=MN8TOARS