Abstract: |
In the big data era, effective frameworks are needed to collect, retrieve, and manage data. Because not all tweets are hashtagged by their users, retrieving the tweets on a given topic is a complicated task. To address this issue, we present a rule-based expert system classifier built on the well-known concept of the fingerprint from forensic science. Using defined rules, the expert system first takes a fingerprint from the tweets of an emerging topic. To make the fingerprint robust, a rule-based search then updates it with its neighboring features. To detect the unhashtagged tweets of the topic, each candidate tweet is checked against the generated fingerprint. Twitter's Streaming API and REST API provide no way to access old Twitter data. To address this second issue, we present a hybrid approach that combines Web scraping with the Twitter Streaming API. Compared with similar works, the presented framework offers (1) a novel two-class classification using an expert system approach that can intelligently and robustly detect most tweets of an emerging topic even when they do not carry the topic's hashtag; and (2) a practical method for extracting old Twitter data. We also performed a comparative text mining study on 195,649 collected Persian and English tweets about the JCPOA. The JCPOA is one of the most important international agreements on the nuclear program, concluded between the Islamic Republic of Iran and the USA, China, France, Russia, Germany, and the United Kingdom. [ABSTRACT FROM AUTHOR] |
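
The fingerprint idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual rules, so everything below is an assumption made for illustration only: the function names (`build_fingerprint`, `expand_fingerprint`, `matches`), the thresholds (`min_share`, `min_cooccur`, `min_hits`), the tokenization, and the sample tweets are all hypothetical and are not the authors' method.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercased word set of a tweet; words shorter than 3 characters are dropped (assumed rule)."""
    return {w for w in re.findall(r"\w+", text.lower()) if len(w) >= 3}


def build_fingerprint(seed_tweets, min_share=0.75):
    """Assumed rule: keep terms that occur in at least `min_share` of the hashtagged seed tweets."""
    docs = [tokens(t) for t in seed_tweets]
    counts = Counter(term for doc in docs for term in doc)
    return {term for term, n in counts.items() if n / len(docs) >= min_share}


def expand_fingerprint(fingerprint, seed_tweets, min_cooccur=2):
    """Assumed rule: add neighbor terms that co-occur with the fingerprint in >= `min_cooccur` tweets."""
    neighbours = Counter()
    for doc in map(tokens, seed_tweets):
        if doc & fingerprint:
            neighbours.update(doc - fingerprint)
    return fingerprint | {t for t, n in neighbours.items() if n >= min_cooccur}


def matches(tweet, fingerprint, min_hits=2):
    """Assumed rule: a tweet belongs to the topic if it shares >= `min_hits` terms with the fingerprint."""
    return len(tokens(tweet) & fingerprint) >= min_hits


# Hypothetical hashtagged tweets of an emerging topic (invented sample data).
seed_tweets = [
    "#JCPOA talks resume in Vienna",
    "Iran nuclear deal: #JCPOA sanctions relief debated",
    "#JCPOA nuclear sanctions update",
    "Vienna hosts nuclear talks #JCPOA",
]

fingerprint = build_fingerprint(seed_tweets)
expanded = expand_fingerprint(fingerprint, seed_tweets)

# Two-class decision for tweets that carry no hashtag.
on_topic = matches("Nuclear talks stall again in Vienna", expanded)
off_topic = matches("Great football match tonight", expanded)
```

In this sketch the neighbor expansion is what lets an unhashtagged tweet match: the first test tweet never mentions the hashtag term, yet it shares several expanded terms with the fingerprint. A real system would need language-aware tokenization for Persian and English and rules tuned on labeled data.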