Popis: |
Seasonal flu is reported to cause hundreds of thousands of deaths around the world every year. Other diseases, such as chickenpox and malaria, are also serious threats to people's physical and mental health. Proper techniques for disease surveillance are therefore in high demand. Social media analysis has recently come to be regarded as an efficient way to achieve this goal, and it is feasible because a growing number of people post their health information to social media such as blogs and personal websites. Previous work on social media analysis has focused mainly on English materials and has hardly considered Chinese materials, which hinders the use of this technique for Chinese people. In this paper, we propose a new method of Chinese social media analysis for disease surveillance. More specifically, we compare different kinds of methods for the classification step and then propose a new way to process Chinese text data. Chinese Sina micro-blog data collected from September to December 2013 is used to validate the effectiveness of the proposed method. The results show a high classification precision of 87.49% on average. Compared with data from the authority, the Chinese National Influenza Center, we can predict the outbreak time of flu 5 days earlier. |