XCLS: A Fast and Effective Clustering Algorithm for Heterogenous XML Documents.

Authors: Ng, Wee Keong, Kitsuregawa, Masaru, Li, Jianzhong, Chang, Kuiyu, Nayak, Richi, Xu, Sumei
Source: Advances in Knowledge Discovery & Data Mining (9783540332060); 2006, p292-302, 11p
Abstract: We present a novel clustering algorithm to group XML documents by similar structures. We introduce a Level structure format to represent the XML documents for efficient processing. We develop a global criterion function that does not require the pair-wise similarity to be computed between two individual documents, but rather measures the similarity at the clustering level, utilising structural information of the XML documents. The experimental analysis shows the method to be fast and accurate. [ABSTRACT FROM AUTHOR]
Database: Supplemental Index