Showing 1 - 10 of 327 for search: '"Rousseeuw, Peter J."'
Robust estimation provides essential tools for analyzing data that contain outliers, ensuring that statistical models remain reliable even in the presence of some anomalous data. While robust methods have long been available in R, users of Python hav
External link:
http://arxiv.org/abs/2411.01954
Discriminant analysis (DA) is one of the most popular methods for classification due to its conceptual simplicity, low computational cost, and often solid performance. In its standard form, DA uses the arithmetic mean and sample covariance matrix to
External link:
http://arxiv.org/abs/2408.15701
Principal component analysis (PCA) is a fundamental tool for analyzing multivariate data. Here the focus is on dimension reduction to the principal subspace, characterized by its projection matrix. The classical principal subspace can be strongly aff
External link:
http://arxiv.org/abs/2408.13596
Author:
Raymaekers, Jakob, Rousseeuw, Peter J.
Published in:
The American Statistician, 2024
(To appear in The American Statistician.) Distance covariance (Székely, Rizzo, and Bakirov, 2007) is a fascinating recent notion, which is popular as a test for dependence of any type between random variables $X$ and $Y$. This approach deserves to
External link:
http://arxiv.org/abs/2406.13052
Distance correlation is a popular measure of dependence between random variables. It has some robustness properties, but not all. We prove that the influence function of the usual distance correlation is bounded, but that its breakdown value is zero.
External link:
http://arxiv.org/abs/2403.03722
Published in:
Journal of Computational and Graphical Statistics, 2024
Multivariate Singular Spectrum Analysis (MSSA) is a powerful and widely used nonparametric method for multivariate time series, which allows the analysis of complex temporal data from diverse fields such as finance, healthcare, ecology, and engineeri
External link:
http://arxiv.org/abs/2310.01182
Published in:
Machine Learning, 2024
Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standa
External link:
http://arxiv.org/abs/2302.03931
Author:
Raymaekers, Jakob, Rousseeuw, Peter J.
Published in:
Econometrics and Statistics, 2024
It is well-known that real data often contain outliers. The term outlier typically refers to a case, that is, a row of the $n \times d$ data matrix. In recent times a different type has come into focus, the cellwise outliers. These are suspicious cel
External link:
http://arxiv.org/abs/2302.02156
Author:
Rousseeuw, Peter J.
Published in:
Econometrics and Statistics, 2024
Often the rows (cases, objects) of a dataset have weights. For instance, the weight of a case may reflect the number of times it has been observed, or its reliability. For analyzing such data many rowwise weighted techniques are available, the most w
External link:
http://arxiv.org/abs/2209.12697
Author:
Raymaekers, Jakob, Rousseeuw, Peter J.
Published in:
Journal of the American Statistical Association, 2025
The usual Minimum Covariance Determinant (MCD) estimator of a covariance matrix is robust against casewise outliers. These are cases (that is, rows of the data matrix) that behave differently from the majority of cases, raising suspicion that they mi
External link:
http://arxiv.org/abs/2207.13493