Reweighting UK Biobank corrects for pervasive selection bias due to volunteering.

Authors: van Alten S (School of Business and Economics, Vrije Universiteit Amsterdam, Amsterdam, Netherlands; Tinbergen Institute, Amsterdam, Netherlands); Domingue BW (Graduate School of Education, Stanford University, Stanford, CA, USA); Faul J (Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA); Galama T (School of Business and Economics, Vrije Universiteit Amsterdam, Amsterdam, Netherlands; Tinbergen Institute, Amsterdam, Netherlands; Center for Economic and Social Research and Department of Economics, University of Southern California, Los Angeles, CA, USA); Marees AT (School of Business and Economics, Vrije Universiteit Amsterdam, Amsterdam, Netherlands)
Language: English
Source: International journal of epidemiology [Int J Epidemiol] 2024 Apr 11; Vol. 53 (3).
DOI: 10.1093/ije/dyae054
Abstract: Background: Biobanks typically rely on volunteer-based sampling. This results in large samples (power) at the cost of representativeness (bias). The problem of volunteer bias is debated. Here, we (i) show that volunteering biases associations in UK Biobank (UKB) and (ii) estimate inverse probability (IP) weights that correct for volunteer bias in UKB.
Methods: Drawing on UK Census data, we constructed a subsample representative of UKB's target population, which consists of all individuals invited to participate. Based on demographic variables shared between the UK Census and UKB, we estimated IP weights (IPWs) for each UKB participant. We compared 21 weighted and unweighted bivariate associations between these demographic variables to assess volunteer bias.
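To illustrate the idea behind IP weights, the following is a minimal sketch, not the authors' estimation procedure. It assumes a single categorical variable shared by a reference sample and a volunteer sample, and weights each volunteer by the ratio of the cell's share in the reference sample to its share in the volunteer sample. The function name `cell_ip_weights` and the toy data are hypothetical.

```python
from collections import Counter


def cell_ip_weights(reference_cells, sample_cells):
    """Return one weight per volunteer: share in reference / share in sample.

    This is a cell-based (nonparametric) stand-in for a model-based
    IP weight; it is an illustrative assumption, not the paper's method.
    """
    ref_counts = Counter(reference_cells)
    smp_counts = Counter(sample_cells)
    n_ref = len(reference_cells)
    n_smp = len(sample_cells)
    weights = []
    for cell in sample_cells:
        ref_share = ref_counts[cell] / n_ref
        smp_share = smp_counts[cell] / n_smp  # > 0 because cell is in the sample
        weights.append(ref_share / smp_share)
    return weights


# Toy data (hypothetical): the reference population is half "low" and half
# "high" education, but volunteers are mostly "high".
reference = ["low"] * 50 + ["high"] * 50
volunteers = ["low"] * 20 + ["high"] * 80

weights = cell_ip_weights(reference, volunteers)

# Weighted share of "low" among volunteers, which should match the reference.
weighted_low_share = sum(
    w for w, cell in zip(weights, volunteers) if cell == "low"
) / sum(weights)
```

In this toy example the under-represented "low" group receives a weight above 1 and the over-represented "high" group a weight below 1, so the weighted volunteer sample reproduces the reference shares.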
Results: Volunteer bias in all associations, as naively estimated in UKB, was substantial, in some cases so severe that unweighted estimates had the opposite sign of the association in the target population. For example, older individuals in UKB reported being in better health, in contrast to evidence from the UK Census. Using IPWs in weighted regressions reduced volunteer bias by 87% on average. Volunteer-based sampling reduced the effective sample size of UKB substantially, to 32% of its original size.
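The loss in effective sample size under weighting can be sketched with Kish's approximation, n_eff = (sum of weights)^2 / (sum of squared weights). Whether the authors used exactly this formula is an assumption here; the function name and the toy weights below are hypothetical and unrelated to the UKB weights.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Toy weights (hypothetical): 20 up-weighted and 80 down-weighted participants.
toy_weights = [2.5] * 20 + [0.625] * 80

n_eff = kish_effective_sample_size(toy_weights)
ratio = n_eff / len(toy_weights)  # effective size as a fraction of nominal size
```

With equal weights the effective sample size equals the nominal sample size; the more unequal the weights, the smaller the ratio becomes.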
Conclusions: Estimates from large-scale biobanks may be misleading due to volunteer bias. We recommend IP weighting to correct for such bias. To aid in the construction of the next generation of biobanks, we provide suggestions on how to best ensure representativeness in a volunteer-based design. For UKB, IPWs have been made available.
(© The Author(s) 2024. Published by Oxford University Press on behalf of the International Epidemiological Association.)
Database: MEDLINE