What and With Whom? Identifying Topics in Twitter Through Both Interactions and Text
Authors: | Robertus Wahyu N. Nugroho, Cecile Paris, Surya Nepal, Weiliang Zhao, Jian Yang |
---|---|
Year of publication: | 2020 |
Subject: | Information Systems and Management; Information retrieval; Computer Networks and Communications; Process (engineering); Computer science; 020207 software engineering; 02 engineering and technology; Semantics; Automatic summarization; Computer Science Applications; Variety (cybernetics); Matrix decomposition; Hardware and Architecture; Market analysis; Similarity (psychology); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Sparse matrix |
Source: | IEEE Transactions on Services Computing. 13:584-596 |
ISSN: | 2372-0204 |
Description: | The overwhelming amount of information continuously flowing through Twitter makes topic derivation essential. Topic derivation plays a valuable role in a variety of Twitter-based applications, such as content recommendation, news summarization, and market analysis. Topic derivation methods are typically based on semantic features of tweet contents. Because tweets are short by nature, such methods suffer from data sparsity. To alleviate this problem, this paper proposes a topic derivation method that incorporates both tweet text similarity and interaction measures. Besides the tweet contents, the approach takes into account several types of interactions among tweets: tweets that mention the same people, replies, and retweets. Topic derivation is done through a two-step matrix factorization process. We conducted a number of experiments on several Twitter datasets to reveal both the individual and integrated effects of the various features being considered. Our experimental results on the TREC2014 dataset and our self-collected tweetMarch dataset demonstrate that the proposed method provides more than 30 percent improvement over other advanced topic derivation methods. |
Database: | OpenAIRE |
External link: |
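
The description names the ingredients of the method (text similarity, shared mentions, replies, retweets, and a two-step matrix factorization) but not its formulas. The sketch below is one possible reading under stated assumptions, not the authors' algorithm: it adds cosine text similarity and binary interaction indicators with equal weight, factorizes the resulting tweet-to-tweet matrix with standard multiplicative-update NMF to get tweet-topic weights, and then fits topic-term weights with those tweet-topic weights held fixed. All function names, the equal weighting, and the toy data are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Row-wise cosine similarity of a tweet-by-term count matrix."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty tweets
    Xn = X / norms
    return Xn @ Xn.T

def relationship_matrix(X, mentions, reply_to, retweet_of):
    """Tweet-to-tweet matrix: text similarity plus interaction indicators.

    Assumption: each signal is simply added with weight 1.
    """
    n = X.shape[0]
    A = cosine_similarity_matrix(X)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared_mention = bool(mentions[i] & mentions[j])
            linked = (reply_to[i] == j or reply_to[j] == i
                      or retweet_of[i] == j or retweet_of[j] == i)
            A[i, j] += float(shared_mention) + float(linked)
    return A

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Step 1: factorize V ~ W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def topic_terms(X, W, iters=200, seed=0, eps=1e-9):
    """Step 2: with tweet-topic weights W fixed, fit X ~ W @ H for H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], X.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return H

# Toy data: 4 tweets over 5 terms (made up for illustration).
X = np.array([[1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 1, 0]], dtype=float)
mentions = [{"alice"}, {"alice"}, set(), set()]  # tweets 0 and 1 mention the same user
reply_to = [None, None, None, 2]                 # tweet 3 replies to tweet 2
retweet_of = [None, None, None, None]

A = relationship_matrix(X, mentions, reply_to, retweet_of)
W, _ = nmf(A, k=2)      # tweet-topic weights
H = topic_terms(X, W)   # topic-term weights
```

Here the interaction terms densify the tweet-to-tweet matrix where short texts alone share few words, which is the sparsity problem the description refers to. Each row of `W` gives a tweet's weight on each topic, and each row of `H` gives a topic's weight on each term.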