Estimating Aggressiveness of Russian Texts by Means of Machine Learning

Authors:	Dmitrii Malov, Dmitriy Levonevskiy, Irina Vatamaniuk
Year of publication:	2019
Subject:	Artificial neural network; Computer science; Sentiment analysis; Emotion detection; Computer and information sciences; Engineering and technology; Python (programming language); Machine learning; Natural sciences; Text processing; Computation theory & mathematics; Assessment methods; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Software system; Artificial intelligence
Source:	Speech and Computer (SPECOM), ISBN: 9783030260606
DOI:	10.1007/978-3-030-26061-3_28
Description:	This paper considers the emotional assessment of Russian-language texts using machine learning, taking aggression detection as an example. It summarizes related work, methods, models and datasets, describes current problems, and proposes a text processing pipeline and a software system for training neural networks on heterogeneous datasets. The experiments show that neural networks trained on annotated corpora in both Russian and English make it possible to determine whether a text item in Russian contains an aggressive message. The authors thoroughly compare different assessment methods, in particular corpus-based approaches, machine learning solutions and hybrid variants. The results can be used to estimate the probability that a message is aggressive, for example to rank messages for subsequent manual verification. They also enable feasibility studies on detecting a particular type of emotion in a text using corpora in other languages. The paper highlights further research directions in which Python toolkits such as NLTK and Keras could be used to improve model performance.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::0ae188a1dcb5450872770d62ecbb44aa https://doi.org/10.1007/978-3-030-26061-3_28