Popis: |
Irrelevant feature elimination, when used correctly, improves feature selection accuracy, which is critical in dimensionality reduction tasks. This additional intelligence improves the search for an optimal subset of features by reducing the dataset based on previous performance. The search procedures used are entirely probabilistic and heuristic. Although existing algorithms use various measures to evaluate the best feature subsets, they fail to eliminate irrelevant features. The procedure explained in this paper focuses on an enhanced feature selection process based on random subset feature selection (RSFS). RSFS uses the random forest (RF) algorithm for better feature reduction. Through extensive testing of this procedure on several scientific datasets with different geometries, we aim to show that the optimal subset of features can be derived by eliminating the features that are two standard deviations away from the mean. In many real-world applications involving scientific data (e.g., cancer detection, diabetes, and medical diagnosis), removing the irrelevant features results in increased detection accuracy at lower cost and in less time. This helps domain experts by identifying a reduced set of features and saving valuable diagnosis time. |
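
The two-standard-deviation rule mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which per-feature quantity the mean and standard deviation are computed over, nor in which direction the deviation counts, so the code below makes two assumptions: each feature has a single relevance score (a hypothetical input, for example one accumulated over random-subset evaluations), and a feature is dropped when its score lies two or more standard deviations from the mean in either direction. The function name `select_features` and the parameter `k` are illustrative and do not come from the paper.

```python
from statistics import mean, pstdev


def select_features(scores, k=2.0):
    """Keep features whose score is within k standard deviations of the mean.

    `scores` maps a feature name to a relevance score. Both the score itself
    and the two-sided test are assumptions made for this sketch.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the scores
    if sigma == 0:
        # All scores are identical, so no feature stands out.
        return dict(scores)
    return {name: s for name, s in scores.items() if abs(s - mu) < k * sigma}


# Usage: nine features with similar scores and one clear outlier.
scores = {f"f{i}": 0.5 for i in range(9)}
scores["noise"] = 0.0
kept = select_features(scores)
print(sorted(kept))  # "noise" is three standard deviations from the mean and is dropped
```

If only low-scoring features should be removed, the condition can be changed to the one-sided test `s > mu - k * sigma`; which variant matches the paper's procedure is not stated in the abstract.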