Model selection by resampling penalization
Author: | Sylvain Arlot |
---|---|
Contributors: | Laboratoire d'informatique de l'école normale supérieure (LIENS), Département d'informatique - ENS Paris (DI-ENS), Models of visual object recognition and scene understanding (WILLOW), Laboratoire de Mathématiques d'Orsay (LM-Orsay), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL), Université Paris-Sud - Paris 11 (UP11), Centre National de la Recherche Scientifique (CNRS), Institut National de Recherche en Informatique et en Automatique (Inria), Inria Paris-Rocquencourt |
Language: | English |
Year of publication: | 2007 |
Subject: | Statistics and Probability; mathematical optimization; heteroscedasticity; model selection; penalization; histogram selection; oracle inequality; adaptivity; V-fold cross-validation; non-parametric statistics; non-parametric regression; resampling; exchangeable weighted bootstrap; regressogram; regression; heteroscedastic data; estimator; smoothness; mathematical statistics; AMS 62G08, 62G09, 62M20; Mathematics - Statistics Theory (math.ST); [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] |
Source: | Electronic Journal of Statistics, Shaker Heights, OH: Institute of Mathematical Statistics, 2009, 3, pp. 557-624. ⟨10.1214/08-EJS196⟩ |
ISSN: | 1935-7524 |
DOI: | 10.1214/08-EJS196 |
Description: | In this paper, a new family of resampling-based penalization procedures for model selection is defined in a general framework. It generalizes several methods, including Efron's bootstrap penalization and the leave-one-out penalization recently proposed by Arlot (2008), to any exchangeable weighted bootstrap resampling scheme. In the heteroscedastic regression framework, assuming the models to have a particular structure, these resampling penalties are proved to satisfy a non-asymptotic oracle inequality with leading constant close to 1. In particular, they are asymptotically optimal. Resampling penalties are used for defining an estimator adapting simultaneously to the smoothness of the regression function and to the heteroscedasticity of the noise. This is remarkable because resampling penalties are general-purpose devices, which have not been built specifically to handle heteroscedastic data. Hence, resampling penalties naturally adapt to heteroscedasticity. A simulation study shows that resampling penalties improve on V-fold cross-validation in terms of final prediction error, in particular when the signal-to-noise ratio is not large. Comment: extended version of hal-00125455, with a technical appendix. An illustrative sketch of the resampling-penalty idea appears after this record. |
Database: | OpenAIRE |
External link: |
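
The description above outlines the resampling-penalty idea: fit each candidate model, then add to its empirical risk a penalty estimated by refitting the same model on exchangeably weighted resamples of the data. Below is a minimal, hedged Python sketch of that idea for regressogram (histogram) selection with Efron-bootstrap weights. The function names (`regressogram_predict`, `resampling_penalty`), the unit penalty constant, the number of resamples, and the toy heteroscedastic data are illustrative assumptions, not the paper's exact procedure or constants.

```python
import numpy as np

rng = np.random.default_rng(0)

def regressogram_predict(x_fit, y_fit, x_eval, n_bins):
    """Piecewise-constant (regressogram) estimator on a regular partition of [0, 1]."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_fit = np.clip(np.digitize(x_fit, edges) - 1, 0, n_bins - 1)
    bin_eval = np.clip(np.digitize(x_eval, edges) - 1, 0, n_bins - 1)
    means = np.zeros(n_bins)
    for b in range(n_bins):
        in_bin = bin_fit == b
        if in_bin.any():                      # empty bins keep the value 0 (sketch-level choice)
            means[b] = y_fit[in_bin].mean()
    return means[bin_eval]

def resampling_penalty(x, y, n_bins, n_resamples=50, const=1.0):
    """Efron-bootstrap penalty (illustrative): average gap between the risk of the
    resampled fit under the original sample and under the resampling weights."""
    n = len(x)
    gaps = []
    for _ in range(n_resamples):
        w = rng.multinomial(n, np.ones(n) / n)           # exchangeable bootstrap weights
        resample = np.repeat(np.arange(n), w)            # points repeated according to weights
        pred = regressogram_predict(x[resample], y[resample], x, n_bins)
        risk_original = np.mean((y - pred) ** 2)         # empirical risk of the resampled fit
        risk_weighted = np.sum(w * (y - pred) ** 2) / n  # weighted risk of the resampled fit
        gaps.append(risk_original - risk_weighted)
    return const * float(np.mean(gaps))

# Toy heteroscedastic regression data (illustrative only): noise level grows with x.
n = 500
x = rng.uniform(0.0, 1.0, n)
y = np.sin(4 * np.pi * x) + (0.2 + 0.8 * x) * rng.normal(size=n)

# Select the partition minimizing empirical risk + resampling penalty.
candidate_bins = [2, 4, 8, 16, 32, 64]
scores = []
for n_bins in candidate_bins:
    pred = regressogram_predict(x, y, x, n_bins)
    empirical_risk = np.mean((y - pred) ** 2)
    scores.append(empirical_risk + resampling_penalty(x, y, n_bins))
print("selected number of bins:", candidate_bins[int(np.argmin(scores))])
```

Replacing the multinomial draw with other exchangeable weight distributions (for example leave-one-out-type or Rademacher-type weights) gives the wider family of resampling penalties described in the abstract; the particular weights, penalty constant, and model collection used here are only a sketch.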