Resources and benchmark corpora for hate speech detection: a systematic review
Author: | Valerio Basile, Manuela Sanguinetti, Fabio Poletto, Cristina Bosco, Viviana Patti |
---|---|
Publication year: | 2020 |
Subject: |
Linguistics and Language
Voice activity detection; Computer science; 02 engineering and technology; Library and Information Sciences; Data science; Language and Linguistics; Education; Focus (linguistics); Hate speech detection; Benchmark corpora; 020204 information systems; Systematic review; 0202 electrical engineering, electronic engineering, information engineering; Benchmark (computing); Key (cryptography); Natural Language Processing shared tasks; 020201 artificial intelligence & image processing; Social media; Computational linguistics |
Source: | Language Resources and Evaluation. 55:477-523 |
ISSN: | 1574-0218 1574-020X |
DOI: | 10.1007/s10579-020-09502-8 |
Description: | Hate Speech in social media is a complex phenomenon, whose detection has recently gained significant traction in the Natural Language Processing community, as attested by several recent review works. Annotated corpora and benchmarks are key resources, considering the vast number of supervised approaches that have been proposed. Lexica play an important role as well for the development of hate speech detection systems. In this review, we systematically analyze the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors. The results of our analysis highlight a heterogeneous, growing landscape, marked by several issues and venues for improvement. |
Database: | OpenAIRE |
External link: |