Ubiquitous Bias and False Discovery Due to Model Misspecification in Analysis of Statistical Interactions: The Role of the Outcome's Distribution and Metric Properties.

Authors: Domingue, Benjamin W.; Kanopka, Klint; Trejo, Sam; Rhemtulla, Mijke; Tucker-Drob, Elliot M.
Source: Psychological Methods; Dec 2024, Vol. 29 Issue 6, p1164-1179, 16p
Abstract: Studies of interaction effects are of great interest because they identify crucial interplay between predictors in explaining outcomes. Previous work has considered several potential sources of statistical bias and substantive misinterpretation in the study of interactions, but less attention has been devoted to the role of the outcome variable in such research. Here, we consider bias and false discovery associated with estimates of interaction parameters as a function of the distributional and metric properties of the outcome variable. We begin by illustrating that, for a variety of noncontinuously distributed outcomes (i.e., binary and count outcomes), attempts to use the linear model for recovery lead to catastrophic levels of bias and false discovery. Next, focusing on transformations of normally distributed variables (i.e., censoring and noninterval scaling), we show that linear models again produce spurious interaction effects. We provide explanations offering geometric and algebraic intuition as to why interactions are a challenge for these incorrectly specified models. In light of these findings, we make two specific recommendations. First, a careful consideration of the outcome's distributional properties should be a standard component of interaction studies. Second, researchers should approach research focusing on interactions with heightened levels of scrutiny.

There is great scientific interest in the degree to which responses to some common stimulus vary across people. Many tests of such variation involve the statistical analysis of interaction terms. We use a variety of evidence (geometric, algebraic, simulation) to argue that incorrect inferences may be made in many cases if details of the outcome variable are not closely monitored. In particular, we show that false positives will result in many cases if a model is not well-suited to the nature of the outcome variable. We offer illustrations from the literature of places where such confusion can occur. We believe that an increased understanding of this problem would lead to improved scientific inquiry and more efficient use of research funds. [ABSTRACT FROM AUTHOR]
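
The binary-outcome case described in the abstract can be illustrated with a small deterministic calculation. This is a minimal sketch, not the authors' simulation code: the coefficient values and the names `sigmoid`, `prob`, and `linear_interaction` are illustrative assumptions. It takes two binary predictors whose effects are purely additive on the logit scale and computes the interaction coefficient that a saturated linear probability model would recover in the population, which is the difference-in-differences of the four cell probabilities.

```python
import math


def sigmoid(eta: float) -> float:
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def prob(x: int, z: int, b0: float, b1: float, b2: float) -> float:
    """P(Y = 1 | x, z) under a logistic model with NO interaction term."""
    return sigmoid(b0 + b1 * x + b2 * z)


def linear_interaction(b0: float = -2.0, b1: float = 1.0, b2: float = 1.0) -> float:
    """Population interaction coefficient of a saturated linear model.

    With binary x and z, regressing Y on x, z, and x*z reproduces the four
    cell means, so the x*z coefficient equals p11 - p10 - p01 + p00.
    The default coefficients are hypothetical values chosen for illustration.
    """
    p00 = prob(0, 0, b0, b1, b2)
    p10 = prob(1, 0, b0, b1, b2)
    p01 = prob(0, 1, b0, b1, b2)
    p11 = prob(1, 1, b0, b1, b2)
    return p11 - p10 - p01 + p00


# The data-generating model has no interaction on the logit scale,
# yet the probability-scale interaction is clearly nonzero.
interaction = linear_interaction()
print(f"linear-model interaction: {interaction:.4f}")

# If one predictor has no effect at all, the spurious interaction vanishes.
null_interaction = linear_interaction(b1=0.0)
print(f"interaction when b1 = 0: {null_interaction:.4f}")
```

Under these assumed coefficients the linear-model interaction is roughly 0.08 even though the generating model contains none, so with a large enough sample a test of that term would reject the null. The effect arises only because the logistic curve is nonlinear over the range the predictors span, which is consistent with the abstract's point that the outcome's distribution matters.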
Database: Supplemental Index