Identifying Retweetable Tweets with a Personalized Global Classifier
Authors: | Michail Vougioukas, Georgios Paliouras, Ion Androutsopoulos |
---|---|
Language: | English |
Publication year: | 2017 |
Subject: |
Social and Information Networks (cs.SI)
FOS: Computer and information sciences; Information retrieval; Computer science; User modeling; Novelty; Computer Science - Social and Information Networks; 02 engineering and technology; Approx; Personalization; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Social media; Communication source; Classifier (UML) |
Source: | SETN |
Description: | In this paper we present a method to identify tweets that a user may find interesting enough to retweet. The method is based on a global but personalized classifier, which is trained on data from several users, with each instance represented in terms of user-specific features. The classifier is therefore trained on a sufficient volume of data while still making personalized decisions, i.e., the same post received by two different users may lead to different classification decisions. Experimenting with a collection of approximately 130K tweets received by 122 journalists, we train a logistic regression classifier using a wide variety of features: the content of each tweet, its novelty, its text similarity to tweets previously posted or retweeted by the recipient or sender of the tweet, the network influence of the author and sender, and their past interactions. Our system obtains an F1 score of approximately 0.9 using only 10 features and 5K training instances. This is a long-paper version of the extended abstract titled "A Personalized Global Filter To Predict Retweets", by the same authors, which was published at the 25th ACM UMAP conference in Bratislava, Slovakia, in July 2017. |
Database: | OpenAIRE |
External link: |
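
The idea in the description of a single global classifier that still makes personalized decisions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two similarity features, the toy users and tweets, and the helper names `cosine`, `features`, `train` and `predict` are all assumptions made for illustration, and a plain gradient-descent logistic regression stands in for whatever training setup the paper used. The point it shows is that one model is fitted on instances pooled from several users, while each instance is described relative to its recipient, so the same tweet can receive different scores for different recipients.

```python
import math
from collections import Counter

def cosine(a, b):
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[word] for word, count in va.items())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def features(tweet, history):
    """Hypothetical user-specific features: max and mean similarity of the
    incoming tweet to the recipient's own past tweets."""
    sims = [cosine(tweet, past) for past in history]
    return [max(sims, default=0.0), sum(sims) / len(sims) if sims else 0.0]

def predict(weights, bias, x):
    """Logistic regression probability that the recipient retweets."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=1.0, epochs=5000):
    """Batch gradient descent on the mean logistic loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, label in zip(X, y):
            err = predict(weights, bias, x) - label
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias

# Toy data (invented): past tweets of two hypothetical recipients.
histories = {
    "sports_user": ["football match result tonight", "tennis final match report"],
    "politics_user": ["parliament vote on budget", "election poll results today"],
}

# Pooled training instances from both users: (recipient, tweet, retweeted?).
training = [
    ("sports_user", "football match preview", 1),
    ("sports_user", "budget vote delayed", 0),
    ("politics_user", "election vote count", 1),
    ("politics_user", "tennis match postponed", 0),
]

X = [features(tweet, histories[user]) for user, tweet, _ in training]
y = [label for _, _, label in training]
weights, bias = train(X, y)  # one global model for all users

# The same incoming tweet, scored for two different recipients.
tweet = "budget vote results"
p_sports = predict(weights, bias, features(tweet, histories["sports_user"]))
p_politics = predict(weights, bias, features(tweet, histories["politics_user"]))
print(f"sports_user: {p_sports:.3f}  politics_user: {p_politics:.3f}")
```

Because the weights are shared, every user's data contributes to training, which is what lets the approach work when individual users have few labelled instances. The personalization comes entirely from the feature representation.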