Fire Emergency Detection from Twitter Using Supervised Principal

Authors: Mohammed Ahsan Raza Noori, Ritika Mehra
Year of publication: 2020
Subject:
Source: ICIIS
Description: Principal Component Analysis (PCA) is primarily a dimensionality reduction technique used in unsupervised machine learning, while its use in supervised machine learning is still being explored. In supervised event detection from social media, PCA has not been well explored as a way to avoid the curse of dimensionality produced by the Vector Space Model (VSM). In this work, we propose a supervised event detection system that detects the occurrence of fire emergencies from Twitter streaming data in near real-time, using supervised PCA as the dimensionality reduction technique. Our aim is to find the minimum number of Principal Components (PCs) that achieves the highest classification performance. We used three machine learning algorithms for classification: Logistic Regression (LR), Support Vector Machine (SVM) and Decision Tree (DT). The performance of each algorithm is compared across its corresponding number of PCs. Our experimental study shows that LR outperforms the other two algorithms and achieves the highest accuracy of 91% using 710 PCs out of 1,000 dimensions. Based on these results, LR is used as the classifier in the actual system. To process high-dimensional data in batch as well as in near real-time, we used the Apache Spark framework.
Database: OpenAIRE
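
The description states the aim of finding the minimum number of PCs that achieves the highest classification performance. A minimal sketch of one such selection rule follows, assuming each candidate PC count has already been scored on held-out data. The function name, the tolerance parameter, and the example accuracies are hypothetical and are not taken from the paper; the record does not say how the authors carried out this search.

```python
from typing import Mapping


def smallest_k_at_best_score(scores: Mapping[int, float], tolerance: float = 0.0) -> int:
    """Return the smallest component count whose score is within
    `tolerance` of the best score found in `scores`.

    `scores` maps a number of principal components to a validation
    score (for example, accuracy) obtained with that many components.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    best = max(scores.values())
    # Among all counts that reach the best score (within tolerance),
    # prefer the one that keeps the fewest components.
    return min(k for k, s in scores.items() if s >= best - tolerance)


# Made-up accuracies for illustration only; these are NOT the paper's results.
example_scores = {100: 0.80, 300: 0.86, 500: 0.90, 700: 0.90, 1000: 0.89}
chosen_k = smallest_k_at_best_score(example_scores)
print(chosen_k)
```

With a tolerance of zero the rule returns the smallest count that ties for the best score. A small positive tolerance trades a little accuracy for fewer components, which may matter when the classifier has to run in near real-time.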