Real-Time Novel Event Detection from Social Media
Author: | Quanzhi Li, Xiaomo Liu, Sameena Shah, Armineh Nourbakhsh |
---|---|
Publication year: | 2017 |
Subject: | Information retrieval; Event (computing); Computer science; Happening; Complex event processing; 02 engineering and technology; Space (commercial competition); Semantics; Term (time); Identification (information); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Cluster analysis |
Source: | ICDE |
Description: | In this paper, we present a new approach for detecting novel events from social media, especially Twitter, in real time. An event is usually defined by who, what, where and when, and an event tweet usually contains terms corresponding to these aspects. To exploit this information, we propose a method that incorporates simple semantics by splitting the tweet term space into groups of terms of the same semantic type. These groups are called semantic categories (classes), and each reflects one or more event aspects. The semantic classes include named entity, mention, location, hashtag, verb, noun and embedded link. To group tweets about the same event into the same cluster, similarity is measured by computing a similarity for each semantic class and then aggregating the class-wise results. Users of a real-time event detection system are usually interested only in novel (new) events, which are happening now or happened a short time ago. To meet this requirement, a temporal identification module filters out event clusters that are about old stories. The clustering module also computes a novelty score for each event cluster, which reflects how novel the event is compared to previous events. We evaluated our event detection method using multiple quality metrics and a large-scale event corpus containing millions of tweets. The experimental results show that the proposed online event detection method achieves state-of-the-art performance. Our experiments also show that the temporal identification module can effectively detect old events. |
Database: | OpenAIRE |
External link: |
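
The class-wise similarity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which per-class measure or aggregation is used, so cosine similarity over term counts, the uniform `WEIGHTS`, the class key names, and the rule of skipping classes that are empty in both inputs are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Semantic classes listed in the abstract (key names are this sketch's own).
CLASSES = ("named_entity", "mention", "location", "hashtag", "verb", "noun", "link")

# Hypothetical uniform weights; the abstract does not state how classes are weighted.
WEIGHTS = {cls: 1.0 for cls in CLASSES}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def classwise_similarity(x: dict, y: dict, weights: dict = WEIGHTS) -> float:
    """Weighted mean of per-class cosine similarities.

    x and y map a class name to a list of terms. Classes that are empty in
    both inputs are skipped so they do not lower the score (an assumption).
    """
    total = 0.0
    weight_sum = 0.0
    for cls in CLASSES:
        a = Counter(x.get(cls, ()))
        b = Counter(y.get(cls, ()))
        if not a and not b:
            continue
        total += weights[cls] * cosine(a, b)
        weight_sum += weights[cls]
    return total / weight_sum if weight_sum else 0.0
```

In a clustering setting, `y` could be the aggregated term counts of an existing cluster rather than a single tweet, and a tweet would join the cluster whose score exceeds some threshold; that threshold and the assignment rule are likewise not specified in the abstract.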