Identifying Topical Shifts in Twitter Streams: An Integration of Non-negative Matrix Factorisation, Sentiment Analysis and Structural Break Models for Large Scale Data

Autor: Christoph Weisser, Benjamin Säfken, Krisztina Kis-Katos, Thomas Kneib, Alexander Silbersdorff, Mattias Luber
Rok vydání: 2021
Předmět:
Zdroj: Disinformation in Open Online Media ISBN: 9783030870300
MISDOOM
DOI: 10.1007/978-3-030-87031-7_3
Popis: We propose an integration of Non-negative Matrix Factorisation, Sentiment analysis and Structural Break Models to identify significant topical shifts on the social media platform Twitter. For the topic modelling, we compare Latent Dirichlet Allocation and Non-negative Matrix Factorization in terms of their applicability to short text documents. The extraction of sentiment is done by the rule-based VADER model. Structural breaks in the relative frequency and daily sentiments of topics over time are identified with the Bai-Perron model. Combining these methods, we provide a valuable and easy to use exploratory tool for social scientists to study the discourse on Twitter over time. Detecting statistically significant shifts in topics over time enables researchers to perform statistical inference and test hypotheses about the discourse on Twitter. The framework is implemented efficiently to ensure that it can be used on average consumer hardware in a reasonable amount of time. A case study with COVID-19 related tweets in the UK is provided. Our method is validated by linking the topical shifts to real world events by the use of the timestamps of the COVID-19 related tweets.
Databáze: OpenAIRE