Description: |
This paper proposes two extensions to a Multi-Label Correlation-Based Feature Selection method (ML-CFS): (1) ML-CFS using the absolute value of the correlation coefficient in the equation that evaluates a candidate feature subset, and (2) ML-CFS using mutual information for class label weighting. These extensions are evaluated in a bioinformatics case study on the multi-label classification of a cancer-related DNA microarray dataset with over 20,000 features. The results show that ML-CFS with the absolute value of the correlation obtained significantly better predictive accuracy (smaller Hamming loss) than the original ML-CFS. In contrast, using mutual information to assign weights to labels had some positive effect with the ML-RBF classifier but a negative effect with the ML-kNN classifier. |
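
The abstract does not give the subset-evaluation equation, so the following is only a minimal sketch of why the absolute value can matter. It assumes the standard CFS merit, k * r_fl / sqrt(k + k(k-1) * r_ff), with the feature-label correlation averaged over all features and labels; the function name `cfs_merit`, its parameters, and the toy correlation values are illustrative assumptions, not the paper's implementation or data.

```python
from math import sqrt


def cfs_merit(r_fl, r_ff, use_abs):
    """Standard CFS merit of a feature subset (assumed form, not taken from the paper).

    r_fl: k x L list of feature-label correlations (k features, L labels).
    r_ff: k x k list of feature-feature correlations.
    use_abs: if True, take absolute values before averaging.
    """
    k = len(r_fl)
    # Flatten the feature-label correlations and the upper triangle of r_ff.
    fl = [r for row in r_fl for r in row]
    ff = [r_ff[i][j] for i in range(k) for j in range(i + 1, k)]
    if use_abs:
        fl = [abs(r) for r in fl]
        ff = [abs(r) for r in ff]
    mean_fl = sum(fl) / len(fl)
    mean_ff = sum(ff) / len(ff) if ff else 0.0
    return k * mean_fl / sqrt(k + k * (k - 1) * mean_ff)


# Toy values (hypothetical): two mutually uncorrelated features, one label.
# One feature correlates positively with the label, the other negatively.
r_fl = [[0.6], [-0.6]]
r_ff = [[1.0, 0.0], [0.0, 1.0]]

signed_merit = cfs_merit(r_fl, r_ff, use_abs=False)
abs_merit = cfs_merit(r_fl, r_ff, use_abs=True)
print(signed_merit, abs_merit)
```

With signed correlations the two equally informative features cancel in the average and the subset receives a merit of zero; with absolute values the subset is credited for both.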