Influential observations in weighted analyses: examples from the National Longitudinal Survey of Children and Youth (NLSCY)

Author: Jennifer J Macnab, J J Koval, K N Speechley, M K Campbell
Year of publication: 2005
Subject:
Source: Chronic diseases in Canada. 26(1)
ISSN: 0228-8699
Description: This paper highlights the impact of survey weights on model fit in multiple linear regression, with specific reference to the National Longitudinal Survey of Children and Youth (NLSCY), and provides recommendations for the treatment of influential observations. Multiple linear regression was used to estimate the association between child and family factors in the preschool years and vocabulary development at school age. Analyses were performed with and without survey weights. Model fit was assessed by examining the distribution of the studentized residuals and the change in the regression coefficients that would occur if an observation were removed. Two summary measures of influence, DFFITS and Cook's D, are reported. The models were then refit excluding influential observations. Weighting the linear model caused observations that were not influential in the unweighted model to exert undue influence on the estimated regression parameters. These influential observations were driven primarily by the size of the survey weight rather than by unusual values of x and y. Researchers working with large national health surveys such as the NLSCY and the National Population Health Survey (NPHS) are advised to carry out a detailed influence analysis before drawing any final conclusions.
Database: OpenAIRE
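
The influence measures named in the description can be illustrated with a minimal sketch. It is not the authors' code or data: it assumes a single predictor, treats the survey weights as ordinary weighted least squares weights (no design-based variance estimation), and uses made-up values for x, y and the weights. The function name `weighted_influence` is hypothetical. It fits the same model with equal weights and with one heavily weighted observation, then reports leverage, the internally studentized residual, DFFITS and Cook's D for each observation.

```python
import math

def weighted_influence(x, y, w):
    """Weighted fit of y = a + b*x plus per-observation influence measures."""
    n, p = len(x), 2  # p = number of regression parameters (intercept, slope)

    # Elements of the weighted normal equations (X'WX) beta = X'Wy.
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx

    intercept = (swxx * swy - swx * swxy) / det
    slope = (sw * swxy - swx * swy) / det

    # Leverage: h_i = w_i * [1, x_i] (X'WX)^-1 [1, x_i]'.
    lev = [wi * (swxx - 2.0 * swx * xi + sw * xi * xi) / det
           for wi, xi in zip(w, x)]
    # Residuals on the weighted scale: sqrt(w_i) * (y_i - fitted_i).
    res = [math.sqrt(wi) * (yi - intercept - slope * xi)
           for wi, xi, yi in zip(w, x, y)]
    sse = sum(e * e for e in res)
    s2 = sse / (n - p)

    rows = []
    for h, e in zip(lev, res):
        r = e / math.sqrt(s2 * (1.0 - h))              # internally studentized
        s2_del = (sse - e * e / (1.0 - h)) / (n - p - 1)  # variance without obs i
        t = e / math.sqrt(s2_del * (1.0 - h))          # externally studentized
        rows.append({
            "leverage": h,
            "studentized": r,
            "dffits": t * math.sqrt(h / (1.0 - h)),
            "cooks_d": r * r * h / (p * (1.0 - h)),
        })
    return (intercept, slope), rows

# Illustrative data only (not from the NLSCY).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.3, 13.8, 18.0]
w_equal = [1.0] * 8
w_survey = [1.0] * 7 + [20.0]  # last observation carries a large weight

fit_eq, diag_eq = weighted_influence(x, y, w_equal)
fit_wt, diag_wt = weighted_influence(x, y, w_survey)

for label, diag in (("equal weights", diag_eq), ("large weight", diag_wt)):
    last = diag[-1]
    print(label, {k: round(v, 3) for k, v in last.items()})
```

Commonly cited rules of thumb flag an observation when |DFFITS| exceeds 2*sqrt(p/n) or Cook's D exceeds 4/n; these cutoffs are conventions, not thresholds stated in the record above. In this sketch the x and y values are identical in both fits, so any change in the diagnostics for the last observation comes from its weight alone.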