Abstract: |
Graph-based semisupervised learning has recently become an indispensable tool for data classification, owing to its innate capability for efficient data structuring and representation. However, its reliance on predefined graphs constrains the efficacy of label propagation (LP) and the interpretability of predictions, especially in high-dimensional feature spaces with limited information. To address these challenges, this article employs a fuzzy graph-based label propagation (FGLP) model, which is inherently interpretable in how it explores the similarities among normalized histogram envelope-based scaled features to assist data categorization. FGLP first constructs an undirected fuzzy-weighted graph from a novel fuzzy distance matrix that exploits local data affinity while reducing the influence of outliers. The learned information is then optimized using two distinct SoftMax-constrained objective functions, coupled with cross-entropy and lasso regularization, to construct the similarity and projection matrices in tandem, which assist LP and feature selection in a scarcely labeled, high-dimensional feature space. Performance validation on heterogeneous datasets showcases FGLP's superiority: it achieves over 88% accuracy with just 10% labeled data, surpassing prior methods by an average improvement of 18.29%. |
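
To make the general idea of graph-based label propagation concrete, the following is a minimal sketch in Python with NumPy. It is not the FGLP model: the abstract does not give the fuzzy distance matrix or the two objective functions, so this sketch assumes a plain k-nearest-neighbour graph whose edge weights come from a row-wise softmax over negative squared Euclidean distances, followed by the standard label-spreading iteration. The function names `softmax_affinity` and `propagate_labels`, the parameters `k`, `alpha`, and `n_iter`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def softmax_affinity(X, k=3):
    """Build an undirected similarity graph (assumed construction, not FGLP's).

    Each row gets a softmax over the negative squared distances to its
    k nearest neighbours; the result is then symmetrized.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-loops
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        logits = -d2[i, nbrs]
        w = np.exp(logits - logits.max())  # numerically stable softmax
        S[i, nbrs] = w / w.sum()
    return (S + S.T) / 2  # symmetrize so the graph is undirected


def propagate_labels(S, y, n_classes, alpha=0.9, n_iter=200):
    """Standard label-spreading iteration; y uses -1 for unlabeled samples."""
    n = S.shape[0]
    Y0 = np.zeros((n, n_classes))
    labeled = y >= 0
    Y0[labeled, y[labeled]] = 1.0  # one-hot rows for labeled samples
    deg = S.sum(axis=1)
    P = S / np.sqrt(np.outer(deg, deg))  # symmetric degree normalization
    F = Y0.copy()
    for _ in range(n_iter):
        F = alpha * (P @ F) + (1 - alpha) * Y0
    return F.argmax(axis=1)


# Toy data: two well-separated groups of 10 points on a line,
# with one labeled sample per group (10% of the 20 samples).
X = np.concatenate([np.arange(10), 100 + np.arange(10)]).reshape(-1, 1)
y = -np.ones(20, dtype=int)
y[0], y[10] = 0, 1
pred = propagate_labels(softmax_affinity(X, k=3), y, n_classes=2)
```

In this sketch the graph is fixed before propagation starts, which is exactly the "predefined graph" limitation the abstract describes; FGLP, by contrast, is stated to learn the similarity and projection matrices jointly.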