Unsupervised probabilistic feature selection using ant colony optimization
Authors: | Behrouz Zamani Dadaneh, Hossein Yeganeh Markid, Ali Zakerolhosseini |
---|---|
Year of publication: | 2016 |
Subject: | 0209 industrial biotechnology; Computer science; Ant colony optimization algorithms; General Engineering; Probabilistic logic; Pattern recognition; Feature selection; 02 engineering and technology; Machine learning; Computer Science Applications; Support vector machine; Naive Bayes classifier; 020901 industrial engineering & automation; Ranking; Artificial Intelligence; Feature (computer vision); Pattern recognition (psychology); 0202 electrical engineering, electronic engineering, information engineering; Minimum redundancy feature selection; 020201 artificial intelligence & image processing; Artificial intelligence |
Source: | Expert Systems with Applications. 53:27-42 |
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2016.01.021 |
Description: | We propose an unsupervised method to remove redundant and irrelevant features. The algorithm needs neither a learning algorithm nor class labels to select features. Similarity between features is considered in the computation of feature relevance. Feature selection (FS) is one of the most important fields in pattern recognition; it aims to pick a subset of relevant and informative features from an original feature set. FS algorithms fall into two kinds depending on whether information about dataset class labels is available: supervised and unsupervised. Supervised approaches use the class labels of the dataset during feature selection. Unsupervised algorithms, on the other hand, operate without class labels, which makes their task more difficult. In this paper, we propose unsupervised probabilistic feature selection using ant colony optimization (UPFS). The algorithm searches for the optimal feature subset in an iterative process. It uses inter-feature information, which expresses the similarity between features, to reduce redundancy in the final set. In each step of the ACO algorithm, to select the next candidate feature, we calculate the amount of redundancy between the current feature and all those selected thus far. In addition, we use a matrix to hold the ants' pheromone, which records the rate of co-presence of every pair of features in solutions. Afterwards, features are ranked by a probability function derived from the matrix, and the top m features are returned as the final solution. We compare the performance of UPFS with 15 well-known supervised and unsupervised feature selection methods using different classifiers (support vector machine, naive Bayes, and k-nearest neighbor) on 10 well-known datasets. The experimental results show the efficiency of the proposed method compared to previous related methods. |
Database: | OpenAIRE |
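
The procedure outlined in the description (redundancy-aware selection of the next feature, a pheromone matrix over feature pairs, and a final probability-based ranking) can be illustrated with a minimal sketch. This is not the paper's implementation: the record gives no formulas, so the absolute cosine similarity as the redundancy measure, the transition weights, the evaporation rate `rho`, the deposit amount, and the function name `aco_rank_features` with all its parameters are assumptions chosen only for illustration.

```python
import math
import random
from itertools import combinations


def abs_cosine(u, v):
    """Absolute cosine similarity of two feature columns (assumed redundancy measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return abs(dot) / norm if norm else 0.0


def aco_rank_features(rows, m, n_ants=10, n_iter=20, subset_size=None,
                      rho=0.2, beta=1.0, seed=0):
    """Hypothetical sketch: rank features without class labels, return the top m."""
    rng = random.Random(seed)
    cols = list(zip(*rows))          # one tuple of values per feature
    n = len(cols)
    if n < 2:
        raise ValueError("need at least two features")
    k = min(subset_size or max(2, n // 2), n)

    # Pairwise feature similarity, used as the redundancy signal.
    sim = [[abs_cosine(cols[i], cols[j]) for j in range(n)] for i in range(n)]
    # Pheromone matrix: tau[i][j] grows when features i and j appear together.
    tau = [[1.0] * n for _ in range(n)]

    for _ in range(n_iter):
        solutions = []
        for _ in range(n_ants):
            current = rng.randrange(n)
            chosen = [current]
            while len(chosen) < k:
                candidates = [j for j in range(n) if j not in chosen]
                weights = []
                for j in candidates:
                    # Mean similarity of candidate j to everything selected so far.
                    redundancy = sum(sim[j][s] for s in chosen) / len(chosen)
                    # Assumed rule: favor high pheromone and low redundancy.
                    weights.append(tau[current][j] / (redundancy + 1e-9) ** beta)
                current = rng.choices(candidates, weights=weights)[0]
                chosen.append(current)
            solutions.append(chosen)

        # Evaporate, then deposit pheromone on every co-present feature pair.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1.0 - rho
        for sol in solutions:
            for i, j in combinations(sol, 2):
                tau[i][j] += 1.0 / n_ants
                tau[j][i] += 1.0 / n_ants

    # Assumed probability function: each feature's share of total pheromone.
    totals = [sum(tau[i][j] for j in range(n) if j != i) for i in range(n)]
    grand = sum(totals)
    probs = [t / grand for t in totals]
    ranking = sorted(range(n), key=lambda i: probs[i], reverse=True)
    return ranking[:m], probs
```

Calling `aco_rank_features(rows, m=2, subset_size=2)` on a small list of numeric rows returns the indices of the two highest-ranked features together with the per-feature probabilities. A fixed `seed` keeps the run reproducible, and no class labels or classifier enter the computation, which matches the unsupervised setting the description emphasizes.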