Abstract: |
Selecting relevant features is vital in machine learning tasks that involve large datasets with many features: it reduces the dimensionality of the dataset and improves model performance. This study introduces a feature selection technique named μ-Relief, based on ReliefF, one of the most widely used Relief-based algorithms. μ-Relief effectively determines the most relevant feature subset and significantly outperforms ReliefF. ReliefF estimates feature quality from the nearest neighbors only, which results in low classification accuracy on non-uniformly distributed or noisy datasets. The proposed μ-Relief technique instead selects neighbors that carry more effective information on the basis of mean distance: it uses neighbors far from the mean distance to obtain feature weight estimates, which improves the algorithm's performance. The algorithm was tested on thirteen real-world datasets and validated on three synthetic datasets. Its effectiveness in selecting relevant features was evaluated against other well-known feature selection algorithms, namely Chi-Square, ANOVA, MI, CMIM, MRMR, SURF*, MultiSURF, MultiSURF*, and ReliefF. Multiple classifiers were trained on the features selected by each technique; measured by classification accuracy, weighted F1-score, and ROC-AUC, μ-Relief effectively determined relevant features and outperformed the other techniques. [ABSTRACT FROM AUTHOR] |
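
For orientation, the nearest-neighbor weighting that the abstract attributes to ReliefF can be sketched as follows. This is a minimal two-class Relief-style estimator, not the authors' implementation: the function names `relief_weights` and `k_nearest`, the Manhattan distance, and the min-max scaling are all assumptions made for illustration. The `select` argument is a hypothetical hook marking where a different neighbor-selection rule (such as one based on mean distance) would plug in; the abstract does not specify that rule, so it is not implemented here.

```python
import numpy as np


def relief_weights(X, y, select):
    """Relief-style feature weights for a two-class problem (illustrative sketch).

    select(d) receives the distances from the target instance to the candidate
    neighbors and returns a boolean mask of the neighbors to use.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape

    # Min-max scale each feature so per-feature differences lie in [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - X.min(axis=0)) / span

    w = np.zeros(p)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])   # per-feature differences to instance i
        dist = diff.sum(axis=1)     # Manhattan distance (an assumption)
        others = np.arange(n) != i
        # Hits (same class) lower a feature's weight, misses raise it.
        for same_class, sign in ((True, -1.0), (False, 1.0)):
            cand = others & ((y == y[i]) == same_class)
            if not cand.any():
                continue
            mask = select(dist[cand])
            if not mask.any():
                continue
            w += sign * diff[cand][mask].mean(axis=0)
    return w / n


def k_nearest(k):
    """Neighbor rule of the nearest-neighbor baseline: keep the k closest."""
    def select(d):
        mask = np.zeros(d.shape, dtype=bool)
        mask[np.argsort(d)[:k]] = True
        return mask
    return select
```

Calling `relief_weights(X, y, k_nearest(5))` returns one weight per feature; features whose values differ more across classes than within a class receive larger weights.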