Clustering and Classification Methods

Authors: Glenn W. Milligan, Stephen C. Hirtle
Publication year: 2012
Subject:
Source: Handbook of Psychology
DOI: 10.1002/9781118133880.hop202007
Description: The chapter by Milligan and Hirtle provides an overview of the current state of knowledge in the field of clustering and classification. Such methods are used to find groups in multivariate data sets. The methods are discussed within the context of exploratory data analysis, though some confirmatory or testing methods are reviewed. A survey of the issues critical to the analysis of empirical data is presented along with “best practice” recommendations for the applied user. Coverage includes sections on data preparation, data models, and data representation using distance and similarity measures. The section on clustering algorithms covers a wide range of classification methods, including latent profile analysis. In addition, the algorithms section includes a discussion of the known cluster recovery performance of various selected clustering methods. The fourth section covers a variety of issues important for applied analyses such as data sampling, variable selection, variable standardization, choosing the number of clusters, and post-classification analysis of the results. Threaded into the discussion are three example applications of the methodology to empirical data. The examples are based on perceived kinship data, animal similarity data, and the classification of single malt scotch whiskies.
Keywords: clustering algorithms; classification validation; tree models of data; Monte Carlo methods; latent profile analysis
Database: OpenAIRE