Scaling Down Dimensions and Feature Extraction in Document Repository Classification
Author: | M. S. Josephine, Asha Kurian, V. Jeyabalaraja |
---|---|
Year of publication: | 2014 |
Subject: | business.industry; Computer science; Computer Science::Information Retrieval; Feature vector; Dimensionality reduction; Document classification; Feature extraction; Feature selection; Pattern recognition; computer.software_genre; ComputingMethodologies_PATTERNRECOGNITION; Feature (computer vision); Principal component analysis; Artificial intelligence; Data mining; business; Cluster analysis; computer |
Source: | International Journal of Data Mining Techniques and Applications. 3:1-4 |
ISSN: | 2278-2419 |
DOI: | 10.20894/ijdmta.102.003.001.001 |
Description: | In this study, a comprehensive evaluation of two supervised feature selection methods for dimensionality reduction is performed: Latent Semantic Indexing (LSI) and Principal Component Analysis (PCA). These are gauged against unsupervised techniques such as fuzzy feature clustering using hard fuzzy C-means (FCM). The main objective of the study is to estimate the relative efficiency of the two supervised techniques against unsupervised fuzzy techniques in reducing the feature space. It is found that clustering using FCM leads to better accuracy in classifying documents compared with evolutionary algorithms like LSI and PCA. Results show that the clustering of features improves the accuracy of document classification. |
Database: | OpenAIRE |
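
The description mentions reducing the feature space by fuzzy clustering of features with FCM. The record does not give the authors' procedure, so the following is only a minimal sketch of one common way to do this, under stated assumptions: each term is described by its column of the document-term matrix, standard fuzzy C-means groups the terms, and each document is then re-expressed as membership-weighted sums over the term clusters. The function names (`fuzzy_c_means`, `reduce_documents`), the fuzzifier `m=2.0`, and the toy counts are all hypothetical and not taken from the paper.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (centers, memberships).

    memberships[i][j] is the degree to which point i belongs to cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are membership-weighted means of the points.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / total
                 for k in range(dim)]
            )
        # Memberships are recomputed from distances to the centers.
        new_u = []
        for p in points:
            d = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            new_u.append(
                [1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                 for j in range(c)]
            )
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


def reduce_documents(doc_term, memberships):
    """Map each document from term space to cluster space (soft weighting)."""
    c = len(memberships[0])
    return [
        [sum(row[i] * memberships[i][j] for i in range(len(row)))
         for j in range(c)]
        for row in doc_term
    ]


# Made-up document-term counts (rows: documents, columns: terms).
doc_term = [
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
]
# Assumption: each term is represented by its column of counts.
term_vectors = [list(col) for col in zip(*doc_term)]
centers, memberships = fuzzy_c_means(term_vectors, c=2)
reduced = reduce_documents(doc_term, memberships)
```

Here `reduced` holds one two-dimensional vector per document in place of the original four term counts; a classifier would then be trained on these reduced vectors. Because each term's memberships sum to 1, the soft weighting redistributes a document's term counts across clusters without changing their total.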