Showing 1 - 10 of 24 for search: '"62G35, 62G05"'
Clustering is a fundamental tool in statistical machine learning in the presence of heterogeneous data. Most recent results focus primarily on optimal mislabeling guarantees when data are distributed around centroids with sub-Gaussian errors. Yet, th
External link:
http://arxiv.org/abs/2401.05574
Statistical tools which satisfy rigorous privacy guarantees are necessary for modern data analysis. It is well-known that robustness against contamination is linked to differential privacy. Despite this fact, using multivariate medians for differenti
External link:
http://arxiv.org/abs/2210.06459
Author:
Sasai, Takeyuki; Fujisawa, Hironori
We consider a robust estimation of linear regression coefficients. In this note, we focus on the case where the covariates are sampled from an $L$-subGaussian distribution with unknown covariance, the noises are sampled from a distribution with a bou
External link:
http://arxiv.org/abs/2102.11120
Author:
Baraud, Yannick; Chen, Juntong
We observe $n$ pairs of independent (but not necessarily i.i.d.) random variables $X_{1}=(W_{1},Y_{1}),\ldots,X_{n}=(W_{n},Y_{n})$ and tackle the problem of estimating the conditional distributions $Q_{i}^{\star}(w_{i})$ of $Y_{i}$ given $W_{i}=w_{i}
External link:
http://arxiv.org/abs/2011.01657
Author:
Sasai, Takeyuki; Fujisawa, Hironori
We consider robust low rank matrix estimation as a trace regression when outputs are contaminated by adversaries. The adversaries are allowed to add arbitrary values to arbitrary outputs. Such values can depend on any samples. We deal with matrix com
External link:
http://arxiv.org/abs/2010.13018
In the classical contamination models, such as the gross-error (Huber and Tukey contamination model or Case-wise Contamination), observations are considered as the units to be identified as outliers or not. This model is very useful when the number o
External link:
http://arxiv.org/abs/1909.04325
Author:
Minsker, Stanislav
This paper is devoted to the estimators of the mean that provide strong non-asymptotic guarantees under minimal assumptions on the underlying distribution. The main ideas behind proposed techniques are based on bridging the notions of symmetry and ro
External link:
http://arxiv.org/abs/1812.03523
We consider the problem of multivariate location and scatter matrix estimation when the data contain cellwise and casewise outliers. Agostinelli et al. (2015) propose a two-step approach to deal with this problem: first, apply a univariate filter to
External link:
http://arxiv.org/abs/1609.00402
Author:
Giulini, Ilaria
We propose a stable version of Principal Component Analysis (PCA) in the general framework of a separable Hilbert space. It consists in interpreting the projection on the first eigenvectors as a step function applied to the spectrum of the covariance
External link:
http://arxiv.org/abs/1606.00187
The generalized log-gamma (GLG) model is a very flexible family of distributions to analyze datasets in many different areas of science and technology. In this paper, we propose estimators which are simultaneously highly robust and highly efficient f
External link:
http://arxiv.org/abs/1512.01473