Experimenting with Feature Constraints Using the NBC Clustering Algorithm.

Authors: Lasek, Piotr; Lasek, Krzysztof
Subject:
Source: Procedia Computer Science; 2024, Vol. 246, p2714-2722, 9p
Abstract: Clustering is a widely utilized data mining technique aimed at discovering unknown patterns in datasets, typically performed in an unsupervised manner. However, the incorporation of external knowledge, transforming it into a semi-supervised process, has shown promising results in improving clustering performance. This study explores the application of feature constraints in clustering algorithms, specifically focusing on relative constraints, which are effective in embedding domain knowledge. We introduce a method to use relative constraints as features, integrating them into the Neighborhood-Based Clustering algorithm. Our experiments demonstrate the impact of these constraints on clustering accuracy and efficiency using various benchmark datasets. The results indicate that the proposed method enhances clustering quality by effectively incorporating domain-specific constraints, offering a novel approach to semi-supervised clustering. [ABSTRACT FROM AUTHOR]
Database: Supplemental Index