Fuzzy C Means Clustering Algorithm for High Dimensional Data Using Feature Subset Selection Technique

Authors: N. Manjula, J. Jagadeesan, S. Pandiarajan
Year of publication: 2014
Subject:
Source: IOSR Journal of Computer Engineering. 16:64-69
ISSN: 2278-0661
2278-8727
DOI: 10.9790/0661-16226469
Description: Feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original entire set of features. A feature selection algorithm may be evaluated from both the efficiency and effectiveness points of view. While efficiency concerns the time required to find a subset of features, effectiveness relates to the quality of that subset. Based on these criteria, an efficient Fuzzy C Means (FCM) approach is proposed and experimentally evaluated in this paper. The FAST algorithm works in two steps. In the first step, features are divided into clusters using graph-theoretic clustering methods. In the second step, the most representative feature that is strongly related to the target classes is selected from each cluster to form a subset of features. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt the efficient Fuzzy C Means (FCM) clustering method. The efficiency and effectiveness of the FAST algorithm are evaluated through an empirical study. Extensive experiments are carried out to compare FAST with several representative feature selection algorithms, namely FCBF, ReliefF, CFS, Consist, and FOCUS-SF, with respect to four types of well-known classifiers, namely the probability-based Naive Bayes, the tree-based C4.5, the instance-based IB1, and the rule-based RIPPER, before and after feature selection. The results, on 35 publicly available real-world high-dimensional image, microarray, and text data sets, demonstrate that FAST not only produces smaller subsets of features but also improves the performance of the four types of classifiers.
Index Terms: feature subset selection, relevance, redundancy and high dimensionality.
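The two-step procedure described in the abstract (cluster the features, then keep one representative feature per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no algorithmic details, so the plain NumPy fuzzy c-means, the use of absolute Pearson correlation with the class label as the relevance score, and the names `fuzzy_c_means` and `select_features` are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on the rows of `points`.

    Returns (centers, u), where u[i, j] is the membership of point i in cluster j.
    """
    rng = np.random.default_rng(seed)
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u


def select_features(X, y, n_clusters, seed=0):
    """Hypothetical helper: cluster feature columns, keep one feature per cluster."""
    # Standardize columns so clustering is not driven by feature scale.
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    # Step 1: cluster the features (columns), so each feature is one "point".
    _, u = fuzzy_c_means(Z.T, n_clusters, seed=seed)
    labels = u.argmax(axis=1)  # harden fuzzy memberships
    # Assumed relevance score: absolute Pearson correlation with the label.
    yc = (y - y.mean()) / (y.std() + 1e-12)
    relevance = np.abs(Z.T @ yc) / len(y)
    # Step 2: from each non-empty cluster keep the most relevant feature.
    selected = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        if members.size:
            selected.append(int(members[relevance[members].argmax()]))
    return sorted(selected)
```

Calling `select_features(X, y, n_clusters=k)` on a samples-by-features array `X` with numeric class labels `y` returns at most `k` column indices. The correlation-based relevance score is only a stand-in; the paper may use a different measure.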
Database: OpenAIRE