Comparative Analysis of PCA and ANOVA for Assessing the Subset Feature Selection of the Geomagnetic Disturbance Storm Time

Authors: M Ain Dzarah Nafisah, Mohamad Huzaimy Jusoh, H Muhamad Asraf, K A Nur Dalila, M.T. Nooritawati
Publication year: 2020
Subject:
Source: Journal of Electrical & Electronic Systems Research. 17:8-16
ISSN: 1985-5389
Description: The Disturbance Storm Time (Dst) index represents the strength of a geomagnetic storm caused by the Sun's interaction with Earth in space weather. The Dst index is formed from a total of nine (9) input features: interplanetary magnetic field (IMF), solar wind density (SWD), solar wind speed (SWS), solar wind input energy (SWIP), and the Earth's magnetic field components, which comprise the horizontal intensity component (H), declination component (D), north component (N), east component (E), and vertical intensity component (Z). The dataset is large, comprising 157,896 data points across all features, and therefore requires pre-processing and subset feature selection to reduce data dimensionality, shorten data processing time, and improve the performance of the learning algorithm. This paper compares two methods of analyzing the features: Principal Component Analysis (PCA) and one-way Analysis of Variance (ANOVA). The main aims of this work are to reduce the large set of input parameters of the Dst index and to compare the feature subsets obtained with each method. Prior to analyzing the features, an independent-samples t-test is used to evaluate whether the means of two groups differ significantly in a way that can be correlated with certain characteristics. The results demonstrate that one-way ANOVA performed better than PCA, eliminating seven (7) of the nine (9) feature components. This finding was validated with a dendrogram, which supports the conclusion that one-way ANOVA outperformed PCA in reducing the feature subset.
Database: OpenAIRE
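
Note: The one-way ANOVA screening mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The record does not say how samples were grouped or which cutoff was used, so the storm-class labels, the synthetic data, and the function names `one_way_anova_f` and `rank_features` below are all assumptions made for illustration. The sketch computes the F-statistic per feature and ranks features by it; a feature whose mean barely changes between groups gets a low score and would be a candidate for elimination.

```python
import random
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def rank_features(samples, labels, feature_names):
    """Rank features by F-statistic, highest (most discriminative) first."""
    classes = sorted(set(labels))
    scores = {}
    for j, name in enumerate(feature_names):
        groups = [[row[j] for row, y in zip(samples, labels) if y == c]
                  for c in classes]
        scores[name] = one_way_anova_f(groups)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic data (assumption): three hypothetical storm classes 0, 1, 2.
# "SWS" shifts with the class; "D" is pure noise unrelated to the class.
rng = random.Random(0)
labels = [c for c in (0, 1, 2) for _ in range(50)]
samples = [[rng.gauss(400 + 100 * c, 10), rng.gauss(0, 1)] for c in labels]

ranking = rank_features(samples, labels, ["SWS", "D"])
for name, f_value in ranking:
    print(f"{name}: F = {f_value:.2f}")
```

In practice an established routine such as `scipy.stats.f_oneway` would usually replace the hand-written statistic; the pure-Python version is used here only to keep the arithmetic visible.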