The minimum regularized covariance determinant estimator

Authors: Kris Boudt, Peter J. Rousseeuw, Steven Vanduffel, Tim Verdonck
Contributors: Econometrics and Data Science, Business, Vrije Universiteit Brussel
Language: English
Year of publication: 2020
Subject:
Statistics and Probability
FOS: Computer and information sciences
Technology
Statistics & Probability
ROBUST
010103 numerical & computational mathematics
01 natural sciences
Regularization (mathematics)
Theoretical Computer Science
Methodology (stat.ME)
010104 statistics & probability
Matrix (mathematics)
Dimension (vector space)
Robustness (computer science)
Scatter matrix
Computer Science
Theory & Methods
Regularization
Convex combination
ALGORITHM
0101 mathematics
Statistics - Methodology
Computer. Automation
Science & Technology
Estimator
Covariance
MULTIVARIATE LOCATION
High-dimensional data
SCATTER
Computational Theory and Mathematics
Breakdown value
Physical Sciences
OUTLIER DETECTION
Robust covariance estimation
Statistics
Probability and Uncertainty
SDG 12 - Responsible Consumption and Production
Algorithm
MATRIX
Mathematics
Source: Statistics and Computing, 30(1), 113-128. Springer Netherlands
Boudt, K, Rousseeuw, P J, Vanduffel, S & Verdonck, T 2020, 'The minimum regularized covariance determinant estimator', Statistics and Computing, vol. 30, no. 1, pp. 113-128. https://doi.org/10.1007/s11222-019-09869-x
ISSN: 0960-3174
DOI: 10.1007/s11222-019-09869-x
Description: © 2019, Springer Science+Business Media, LLC, part of Springer Nature. The minimum covariance determinant (MCD) approach estimates the location and scatter matrix using the subset of a given size with the lowest sample covariance determinant. Its main drawback is that it cannot be applied when the dimension exceeds the subset size. We propose the minimum regularized covariance determinant (MRCD) approach, which differs from the MCD in that the scatter matrix is a convex combination of a target matrix and the sample covariance matrix of the subset. A data-driven procedure sets the weight of the target matrix, so that the regularization is only used when needed. The MRCD estimator is defined in any dimension, is well-conditioned by construction, and preserves the good robustness properties of the MCD. We prove that so-called concentration steps can be performed to reduce the MRCD objective function, and we exploit this fact to construct a fast algorithm. We verify the accuracy and robustness of the MRCD estimator in a simulation study and illustrate its practical use for outlier detection and regression analysis on real-life high-dimensional data sets in chemistry and criminology.
Is part of: Statistics and Computing, vol. 30, issue 1, pages 113-128. Status: published
Database: OpenAIRE
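
The two ideas named in the abstract, a scatter matrix formed as a convex combination of a target matrix and the subset's sample covariance, and concentration steps that re-select the subset, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes a fixed weight `rho` (the paper sets it by a data-driven procedure), an identity target matrix, a single random starting subset, and no prior standardization or consistency scaling. The function names `regularized_scatter`, `c_step`, and `mrcd_sketch` are hypothetical and chosen only for illustration.

```python
import numpy as np


def regularized_scatter(X_subset, target, rho):
    """Convex combination: rho * target + (1 - rho) * subset sample covariance."""
    S = np.cov(X_subset, rowvar=False, bias=True)  # covariance with 1/h scaling
    return rho * target + (1.0 - rho) * S


def c_step(X, subset_idx, target, rho):
    """One concentration step: keep the h points closest to the current fit."""
    h = len(subset_idx)
    center = X[subset_idx].mean(axis=0)
    scatter = regularized_scatter(X[subset_idx], target, rho)
    diff = X - center
    # Squared Mahalanobis-type distances of all n points to the current fit.
    d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(scatter, diff.T).T)
    return np.sort(np.argsort(d2)[:h])


def mrcd_sketch(X, h, rho, target=None, max_iter=100, seed=0):
    """Iterate concentration steps from one random subset (assumed start)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    target = np.eye(p) if target is None else target  # assumed identity target
    subset = np.sort(rng.choice(n, size=h, replace=False))
    for _ in range(max_iter):
        new_subset = c_step(X, subset, target, rho)
        if np.array_equal(new_subset, subset):
            break  # the subset no longer changes
        subset = new_subset
    center = X[subset].mean(axis=0)
    scatter = regularized_scatter(X[subset], target, rho)
    return center, scatter, subset
```

When the dimension exceeds the subset size, the subset's sample covariance is singular, but for any `rho > 0` the combination with a positive definite target stays invertible, so the distances in `c_step` remain computable. That is the situation in which the plain MCD cannot be applied.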