Problems due to Small Samples and Sparse Data in Conditional Logistic Regression Analysis
Authors: | William D. Finkle, Sander Greenland, Judith A. Schwartzbaum |
---|---|
Year of publication: | 2000 |
Subject: |
Epidemiology
Computer science; Matched-Pair Analysis; Bayesian probability; Context (language use); Cross-sectional regression; Logistic regression; Risk Assessment; Central Nervous System Neoplasms; Electromagnetic Fields; Bias; Statistics; Odds Ratio; Humans; Child; Multinomial logistic regression; Likelihood Functions; Discrete choice; Leukemia; fungi; Multilevel model; food and beverages; Regression analysis; Glioma; Diet; Logistic Models; Case-Control Studies; Regression Analysis; Epidemiologic Methods |
Source: | American Journal of Epidemiology. 151:531-539 |
ISSN: | 1476-6256 0002-9262 |
DOI: | 10.1093/oxfordjournals.aje.a010240 |
Description: | Conditional logistic regression was developed to avoid "sparse-data" biases that can arise in ordinary logistic regression analysis. Nonetheless, it is a large-sample method that can exhibit considerable bias when certain types of matched sets are infrequent or when the model contains too many parameters. Sparse-data bias can cause misleading inferences about confounding, effect modification, dose response, and induction periods, and can interact with other biases. In this paper, the authors describe these problems in the context of matched case-control analysis and provide examples from a study of electrical wiring and childhood leukemia and a study of diet and glioma. The same problems can arise in any likelihood-based analysis, including ordinary logistic regression. The problems can be detected by careful inspection of data and by examining the sensitivity of estimates to category boundaries, variables in the model, and transformations of those variables. One can also apply various bias corrections or turn to methods less sensitive to sparse data than conditional likelihood, such as Bayesian and empirical-Bayes (hierarchical regression) methods. |
Database: | OpenAIRE |
External link: |
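
The description's point that conditional logistic regression is a large-sample method can be illustrated with a small calculation. The sketch below is not taken from the paper. It assumes the simplest matched design: 1:1 matched pairs with a single binary exposure, where the conditional maximum-likelihood estimate of the odds ratio is the ratio of the two kinds of discordant pairs. The function name `mean_conditional_or` and the chosen true odds ratio of 2 are hypothetical, picked only for illustration. The code computes the exact mean of the estimate, given that it is finite, for different numbers of discordant pairs.

```python
from math import comb


def mean_conditional_or(n_discordant: int, true_or: float) -> float:
    """Exact mean of the matched-pair odds ratio estimate X / (n - X),
    conditional on the estimate being finite (X < n).

    X is the number of discordant pairs in which the case is the exposed
    member; under the model, X ~ Binomial(n, true_or / (1 + true_or)).
    """
    p = true_or / (1.0 + true_or)
    weighted_sum = 0.0
    prob_finite = 0.0
    # x = n_discordant is excluded because the estimate is infinite there.
    for x in range(n_discordant):
        pmf = comb(n_discordant, x) * p**x * (1.0 - p) ** (n_discordant - x)
        weighted_sum += pmf * x / (n_discordant - x)
        prob_finite += pmf
    return weighted_sum / prob_finite


if __name__ == "__main__":
    # Hypothetical true odds ratio of 2; vary the number of informative pairs.
    for n in (10, 50, 200):
        print(n, round(mean_conditional_or(n, 2.0), 3))
```

With few discordant pairs the mean estimate lies well above the true value, and the gap shrinks as the number of informative pairs grows. That is the pattern of bias away from the null that the description attributes to sparse data.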