WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets

Author:	Nguyen, Dat Quoc; Vu, Thanh; Rahimi, Afshin; Dao, Mai Hoang; Nguyen, Linh The; Doan, Long
Publication year:	2020
Subject:	Computer Science - Computation and Language
Document type:	Working Paper
Description:	In this paper, we provide an overview of the WNUT-2020 shared task on the identification of informative COVID-19 English Tweets. We describe how we construct a corpus of 10K Tweets and organize the development and evaluation phases for this task. We also present a brief summary of results obtained from the final system evaluation submissions of 55 teams, finding that (i) many systems obtain very high performance, up to 0.91 F1 score, (ii) the majority of the submissions achieve substantially higher results than the baseline fastText (Joulin et al., 2017), and (iii) fine-tuning pre-trained language models on relevant language data followed by supervised training performs well in this task. Comment: In Proceedings of the 6th Workshop on Noisy User-generated Text
Database:	arXiv
External link:	http://arxiv.org/abs/2010.08232