Automatic Classification of Tweets for Analyzing Communication Behavior of Museums

Authors: Nicolas Foucault, Antoine Courtin
Contributors: Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Université Paris Saclay (COmUE)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université - UFR d'Ingénierie (UFR 919), Sorbonne Université (SU)-Sorbonne Université (SU)-Université Paris-Saclay-Université Paris-Sud - Paris 11 (UP11), Institut National d'Histoire de l'Art (INHA), INHA
Language: English
Year of publication: 2016
Subject:
Source: Tenth International Conference on Language Resources and Evaluation (LREC 2016)
Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 2016, Portorož, Slovenia
Scopus-Elsevier
HAL
Description: In this paper, we present a study on tweet classification that aims to define the communication behavior of the 103 French museums that took part in the 2014 Twitter operation MuseumWeek. The tweets were automatically classified into four communication categories: sharing experience, promoting participation, interacting with the community, and promoting-informing about the institution. Our classification is multi-class. It combines Support Vector Machines and Naive Bayes methods and relies on a selection of eighteen subtypes of features of four kinds: metadata information, punctuation marks, tweet-specific features, and lexical features. It was tested on a corpus of 1,095 tweets manually annotated by two experts in Natural Language Processing and Information Communication and twelve Community Managers of French museums. We obtained an F1-score of 72% with 10-fold cross-validation. This result is very encouraging, since it is better than some state-of-the-art results reported in the tweet classification literature.
Database: OpenAIRE
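
Illustration: the description mentions four kinds of features, an F1-score, and 10-fold cross-validation without giving details. The Python sketch below shows, under stated assumptions, what such building blocks might look like. The category label strings, the specific features, the function names, and the choice of macro-averaging are all hypothetical; the record does not list the paper's eighteen feature subtypes or say how the F1-score was averaged, and the sketch does not reproduce the authors' SVM and Naive Bayes combination.

```python
import re

# Hypothetical label strings for the four categories named in the description.
CATEGORIES = (
    "sharing_experience",
    "promoting_participation",
    "interacting_with_community",
    "promoting_informing",
)


def extract_features(text):
    """Return a few illustrative tweet features (not the paper's feature set)."""
    tokens = text.split()
    return {
        # Tweet-specific features (assumed examples).
        "n_hashtags": sum(t.startswith("#") and len(t) > 1 for t in tokens),
        "n_mentions": sum(t.startswith("@") and len(t) > 1 for t in tokens),
        "n_urls": len(re.findall(r"https?://\S+", text)),
        # Punctuation-mark features (assumed examples).
        "n_exclamations": text.count("!"),
        "n_questions": text.count("?"),
        # A crude lexical feature (assumed example).
        "n_tokens": len(tokens),
    }


def macro_f1(gold, predicted, labels=CATEGORIES):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    start = 0
    for i in range(k):
        # The first n_items % k folds receive one extra item.
        size = n_items // k + (1 if i < n_items % k else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size
```

With a corpus of 1,095 items and k=10, `k_fold_indices` yields five test folds of 110 items followed by five of 109. In practice one would likely shuffle or stratify the items by category before splitting, which this sketch omits.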