Description: |
This paper presents an overview of selected clustering models and demonstrates an application of the K-Means algorithm to document clustering. The introductory part defines the basic concepts and describes the characteristics common to clustering models. An overview of clustering models follows, presenting for each algorithm its clustering method, basic characteristics, visualization, and possible input data. The authors also explain how each algorithm is assessed, using measures such as the Rand index, homogeneity, completeness, V-measure, and the Silhouette coefficient. Finally, the paper describes the application of the K-Means algorithm to document clustering, showing the final result and elaborating on the procedures applied when clustering the documents. |