Errors in Statistical Inference Under Model Misspecification: Evidence, Hypothesis Testing, and AIC
Author: | José Miguel Ponciano, Brian Dennis, Mark L. Taper, Subhash R. Lele |
---|---|
Language: | English |
Year: | 2019 |
Subject: |
model selection; Kullback–Leibler divergence; Akaike's information criterion; information criteria; frequentist inference; hypothesis testing; statistical inference; model misspecification; evidence; evidential statistics; sample size determination; error rates in model selection; Type I and Type II errors; Ecology, Evolution, Behavior and Systematics |
Source: | Frontiers in Ecology and Evolution, Vol 7 (2019) |
ISSN: | 2296-701X |
Description: | The methods for making statistical inferences in scientific analysis have diversified even within the frequentist branch of statistics, but comparison among them has been elusive. We approximate analytically and numerically the performance of Neyman-Pearson hypothesis testing, Fisher significance testing, information criteria, and evidential statistics (Royall, 1997). This last approach is implemented in the form of evidence functions: statistics for comparing two models by estimating, based on data, their relative distances to the generating process (i.e., truth) (Lele, 2004). A consequence of this definition is the salient property that the probabilities of misleading or weak evidence, error probabilities analogous to Type I and Type II error rates in hypothesis testing, all approach 0 as sample size increases. Our comparison of these approaches focuses primarily on the frequency with which errors are made, both when models are correctly specified and when they are misspecified, but it also considers ease of interpretation. The error rates in evidential analysis all decrease to 0 as sample size increases, even under model misspecification. Neyman-Pearson testing, on the other hand, exhibits great difficulties under misspecification. The real Type I and Type II error rates can be less than, equal to, or greater than the nominal rates, depending on the nature of the model misspecification. Under some reasonable circumstances, the probability of Type I error is an increasing function of sample size that can even approach 1! In contrast, under model misspecification an evidential analysis retains the desirable properties of always having a greater probability of selecting the best model over an inferior one and of having the probability of selecting the best model increase monotonically with sample size. 
We show that the evidence function concept fulfills the objectives of model selection in ecology, in both a statistical and a scientific sense, and that evidence functions are intuitive and easily grasped. We find that consistent information criteria are evidence functions, but the MSE-minimizing (or efficient) information criteria (e.g., AIC, AICc, TIC) are not. The error properties of the MSE-minimizing criteria switch between those of evidence functions and those of Neyman-Pearson tests, depending on the models being compared. |
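The central evidential property described above, that the probability of misleading evidence vanishes as sample size grows, can be illustrated with a minimal simulation that is not from the paper itself. The sketch below uses the log-likelihood ratio as the evidence function for two normal models (the hypothetical means `mu_true` and `mu_alt`, the evidence threshold `k = 8`, and the replicate count are illustrative choices, not values from the article), and estimates by Monte Carlo how often the evidence strongly favors the wrong model at several sample sizes.

```python
import math
import random

random.seed(1)

def log_lik_normal(xs, mu):
    """Log-likelihood of data xs under a Normal(mu, 1) model."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2
               for x in xs)

def prob_misleading(n, mu_true=0.0, mu_alt=0.5, k=8.0, reps=2000):
    """Estimate the probability of misleading evidence: the fraction of
    replicates in which the likelihood ratio favors the wrong model
    (mu_alt) over the generating model (mu_true) by at least a factor k.
    All parameter values here are illustrative assumptions."""
    count = 0
    for _ in range(reps):
        xs = [random.gauss(mu_true, 1.0) for _ in range(n)]
        # Evidence function: log-likelihood ratio of wrong vs. true model.
        log_lr = log_lik_normal(xs, mu_alt) - log_lik_normal(xs, mu_true)
        if log_lr > math.log(k):
            count += 1
    return count / reps

# The estimated probability of misleading evidence shrinks toward 0
# as the sample size n increases.
for n in (10, 50, 200):
    print(n, prob_misleading(n))
```

Under correct specification, this decay toward zero holds for both error types of the evidential approach, which is the contrast the abstract draws with the fixed nominal Type I error rate of a Neyman-Pearson test.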
Database: | OpenAIRE |
External link: |