Detecting multivariate outliers: Use a robust variant of the Mahalanobis distance
Author: | Olivier Klein, Christophe Ley, Yves Dominicy, Christophe Leys |
Year of publication: | 2018 |
Subject: | Mahalanobis distance; Multivariate statistics; Robust statistics; Covariance matrix; Covariance; Outlier; Bivariate analysis; Population; Statistics; Social psychology; Sociology and political science; Psychology; Research methods in psychology |
Source: | Journal of Experimental Social Psychology |
ISSN: | 0022-1031 |
Description: | A look at the psychology literature reveals that researchers still seem to encounter difficulties in coping with multivariate outliers. Multivariate outliers can severely distort the estimation of population parameters. In practice, their detection is often neglected or carried out with the basic Mahalanobis distance. However, this indicator relies on the multivariate sample mean and covariance matrix, which are themselves particularly sensitive to outliers; the method is therefore problematic. We highlight the disadvantages of the basic Mahalanobis distance and argue instead in favor of a robust Mahalanobis distance. In particular, we present a variant based on the Minimum Covariance Determinant (MCD), a more robust procedure that is easy to implement. Using Monte Carlo simulations of bivariate sample distributions varying in size (n = 20, 100, 500) and population correlation coefficient (ρ = .10, .30, .50), we demonstrate the detrimental impact of outliers on parameter estimation and show the superiority of the MCD-based distance over the basic Mahalanobis distance. We also make recommendations for deciding whether to include or exclude outliers. Finally, we provide the procedures for calculating this indicator in R and SPSS software. (An illustrative R sketch follows this record.) |
Database: | OpenAIRE |
External link: |
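The description states that the article supplies procedures for computing the MCD-based robust Mahalanobis distance in R and SPSS. The following is a minimal R sketch of the general technique only, not the authors' published script: it assumes the MASS package, a small simulated bivariate sample with one artificial outlier, and the conventional .975 chi-square quantile as cutoff (the cutoff actually recommended in the article may differ).

```r
## Illustrative sketch only. Assumptions: MASS is installed; the simulated data
## and the .975 chi-square cutoff are conventions chosen here, not taken from
## the article.
library(MASS)

set.seed(1)
## Simulate a bivariate sample (n = 100, rho = .30) and append one gross outlier.
Sigma <- matrix(c(1, .30, .30, 1), nrow = 2)
x <- mvrnorm(n = 100, mu = c(0, 0), Sigma = Sigma)
x <- rbind(x, c(6, -6))

## Classical squared Mahalanobis distances (sample mean and covariance).
d2_classic <- mahalanobis(x, colMeans(x), cov(x))

## Robust distances: location and scatter estimated by the
## Minimum Covariance Determinant.
mcd <- cov.mcd(x)
d2_robust <- mahalanobis(x, mcd$center, mcd$cov)

## Flag observations exceeding the chi-square cutoff (df = number of variables).
cutoff <- qchisq(.975, df = ncol(x))
which(d2_robust > cutoff)
```

The MCD estimate could equally be obtained with covMcd() from the robustbase package; in either case, flagged observations should be inspected before deciding whether to retain or exclude them, in line with the recommendations the article makes.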