Graph embeddings for Abusive Language Detection

Authors:	Noé Cecillon, Georges Linarès, Vincent Labatut, Richard Dufour
Contributors:	Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI
Publication year:	2021
Subject:	FOS: Computer and information sciences; Theoretical computer science; General Computer Science; Language identification; Computer Networks and Communications; Computer science; Process (engineering); Graph embedding; Automatic abuse detection; 02 engineering and technology; Social networks; [INFO.INFO-SI]Computer Science [cs]/Social and Information Networks [cs.SI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Conversational graph; Artificial Intelligence; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Set (psychology); Representation (mathematics); Structure (mathematical logic); Social and Information Networks (cs.SI); Online conversations; Node (networking); Computer Science - Social and Information Networks; Computer Graphics and Computer-Aided Design; Graph; Computer Science Applications; Computational Theory and Mathematics; 020201 artificial intelligence & image processing
Source:	SN Computer Science, Springer, 2021, 2, pp.37. ⟨10.1007/s42979-020-00413-7⟩
ISSN:	2662-995X 2661-8907
DOI:	10.48550/arxiv.2101.02988
Description:	Abusive behaviors are common on online social networks. The increasing frequency of antisocial behaviors forces the hosts of online platforms to find new solutions to address this problem. Automating the moderation process has thus received a lot of interest in the past few years. Various methods have been proposed, most based on the exchanged content, and one relying on the structure and dynamics of the conversation. The latter has the advantage of being language-independent; however, it leverages a hand-crafted set of topological measures which are computationally expensive and not necessarily suitable for all situations. In the present paper, we propose to use recent graph embedding approaches to automatically learn representations of conversational graphs depicting message exchanges. We compare two categories: node vs. whole-graph embeddings. We experiment with a total of 8 approaches and apply them to a dataset of online messages. We also study more precisely which aspects of the graph structure are leveraged by each approach. Our study shows that the representation produced by certain embeddings captures the information conveyed by specific topological measures, but misses other aspects.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b1e7f1c83d38f6cb4358333bd028a487