Abstract: |
Classifying tweet content can serve as a useful feature for other application tasks. However, such classification is challenging because tweets are short and their content is sparse. Although individual tweets are limited in length, their content covers many different topics, so achieving good coverage of content features remains difficult. In this research, we adopt a keyword expansion technique and study the enrichment of tweet content using text from credible sources, such as news sites. For evaluation, we conduct experiments on two Twitter datasets using four standard classifiers. The proposed approach improves classification performance, with accuracy gains ranging from +0.05% to +3.54% across both datasets. The experimental results demonstrate that the proposed feature enrichment method can overcome the sparseness limitation of short text and improve classification performance across various classifiers. |