Introducing the Gab Hate Corpus: Defining and applying hate-based rhetoric to social media posts at scale

Authors: Mohammad Atari, Gwenyth Portillo-Wightman, Kim Y, Park C, Joe Hoover, Brendan F. Kennedy, Shreya Havaldar, Leigh Yeh, Wang C, Morteza Dehghani, Hussain A, Wang X, Coombs K, Aida Mostafazadeh Davani, Olmos G, Lara A, Elaine Gonzalez, Ali Omrani, Omary A, Zhang Y, Azatian A
Year of publication: 2018
Subject:
Description: We present the Gab Hate Corpus (GHC), consisting of 27,665 posts from the social network service gab.com, each annotated for the presence of “hate-based rhetoric” by a minimum of three annotators. Posts were labeled according to a coding typology derived from a synthesis of hate speech definitions across legal precedent, previous hate speech coding typologies, and definitions from psychology and sociology, comprising hierarchical labels indicating dehumanizing and violent speech as well as indicators of targeted groups and rhetorical framing. We provide inter-annotator agreement statistics and perform a classification analysis to validate the corpus and establish performance baselines. The GHC complements existing hate speech datasets in its theoretical grounding and by providing a large, representative sample of richly annotated social media posts.
Database: OpenAIRE