Scaling Down Dimensions and Feature Extraction in Document Repository Classification

Authors: M. S. Josephine, Asha Kurian, V. Jeyabalaraja
Publication year: 2014
Subject:
Source: International Journal of Data Mining Techniques and Applications. 3:1-4
ISSN: 2278-2419
DOI: 10.20894/ijdmta.102.003.001.001
Description: In this study, a comprehensive evaluation of two supervised feature selection methods for dimensionality reduction is performed: Latent Semantic Indexing (LSI) and Principal Component Analysis (PCA). These are gauged against unsupervised techniques such as fuzzy feature clustering using hard fuzzy C-means (FCM). The main objective of the study is to estimate the relative efficiency of the two supervised techniques against unsupervised fuzzy techniques in reducing the feature space. It is found that clustering using FCM leads to better accuracy in classifying documents compared with evolutionary algorithms like LSI and PCA. Results show that the clustering of features improves the accuracy of document classification.
Database: OpenAIRE