Popis: |
Analysis of discrete data, and especially contingency table data, plays a central role in biostatistics. Traditional methods rely on approximations based on asymptotic results, which are very powerful but not always appropriate. In this article we show that efficient rerandomization methods may be developed for many commonly used models and tests: multinomial testing, specifically goodness-of-fit and max tests; and goodness-of-fit of log-linear models for contingency tables. The feasibility (complexity) of these algorithms is a function of the sufficient statistics for the models. By contrast, algorithms that require the explicit enumeration of all outcomes in the sample space are exponential in the degrees of freedom and are usually not feasible except when sample sizes are unrealistically small. The algorithms we present differ from recently proposed methods in two ways: we show how to calculate permutation distributions of commonly used statistics rather than calculating p-values for exact tests, and we emphasize underlying probability formulas rather than implementation details. |
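
To make the contrast with explicit enumeration concrete, the following is a minimal sketch of the brute-force baseline for a multinomial goodness-of-fit test. It is not the article's algorithm. It assumes Pearson's chi-square as the test statistic, and all function names (`compositions`, `multinomial_pmf`, `pearson_chi2`, `exact_gof_pvalue`, `sample_space_size`) are hypothetical, introduced only for illustration.

```python
from math import comb, factorial, prod


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Probability of the cell counts under a multinomial with cell probabilities probs."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # each intermediate quotient is an exact integer
    return coef * prod(p ** c for p, c in zip(probs, counts))


def pearson_chi2(counts, probs):
    """Pearson chi-square statistic (assumes every null probability is positive)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


def exact_gof_pvalue(observed, probs, tol=1e-9):
    """Sum the null probabilities of all outcomes at least as extreme as the observed one."""
    n, k = sum(observed), len(observed)
    t_obs = pearson_chi2(observed, probs)
    return sum(
        multinomial_pmf(x, probs)
        for x in compositions(n, k)
        if pearson_chi2(x, probs) >= t_obs - tol  # tol guards against float ties
    )


def sample_space_size(n, k):
    """Number of outcomes the enumeration must visit: C(n + k - 1, k - 1)."""
    return comb(n + k - 1, k - 1)
```

The enumeration visits `sample_space_size(n, k)` outcomes, a count that grows quickly with both the sample size `n` and the number of cells `k`, which is why this approach is practical only for very small problems.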