Pathogen exposure misclassification can bias association signals in GWAS of infectious diseases when using population-based common control subjects.

Authors: Duchen D; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA., Vergara C; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA., Thio CL; Division of Infectious Diseases, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA., Kundu P; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA., Chatterjee N; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA., Thomas DL; Division of Infectious Diseases, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA., Wojcik GL; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA., Duggal P; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA. Electronic address: pduggal@jhu.edu.
Language: English
Source: American journal of human genetics [Am J Hum Genet] 2023 Feb 02; Vol. 110 (2), pp. 336-348. Date of Electronic Publication: 2023 Jan 16.
DOI: 10.1016/j.ajhg.2022.12.013
Abstract: Genome-wide association studies (GWASs) have been performed to identify host genetic factors for a range of phenotypes, including for infectious diseases. The use of population-based common control subjects from biobanks and extensive consortia is a valuable resource to increase sample sizes in the identification of associated loci with minimal additional expense. Non-differential misclassification of the outcome has been reported when the control subjects are not well characterized, which often attenuates the true effect size. However, for infectious diseases the comparison of affected subjects to population-based common control subjects regardless of pathogen exposure can also result in selection bias. Through simulated comparisons of pathogen-exposed cases and population-based common control subjects, we demonstrate that not accounting for pathogen exposure can result in biased effect estimates and spurious genome-wide significant signals. Further, the observed association can be distorted depending upon the strength of the association between a locus and pathogen exposure and the prevalence of pathogen exposure. We also used a real data example from the hepatitis C virus (HCV) genetic consortium comparing HCV spontaneous clearance to persistent infection with both well-characterized control subjects and population-based common control subjects from the UK Biobank. We find biased effect estimates for known HCV clearance-associated loci and potentially spurious HCV clearance associations. These findings suggest that the choice of control subjects is especially important for infectious diseases or outcomes that are conditional upon environmental exposures.
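Illustration (not from the article): the selection-bias mechanism described in the abstract can be sketched with a small expected-value calculation. This is not the authors' simulation; it computes expected allele frequencies analytically instead of drawing random samples, and it rests on assumptions that are ours alone: Hardy-Weinberg genotype proportions, logistic models for exposure and for the outcome among exposed individuals, and population controls drawn from the general population regardless of exposure. All function and parameter names are hypothetical.

```python
import math


def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def genotype_probs(maf):
    """Hardy-Weinberg probabilities for 0, 1, 2 copies of the allele (assumed)."""
    q = 1.0 - maf
    return [q * q, 2.0 * maf * q, maf * maf]


def allele_freq(weights):
    """Allele frequency from unnormalized genotype weights for g = 0, 1, 2."""
    total = sum(weights)
    return (weights[1] + 2.0 * weights[2]) / (2.0 * total)


def allelic_or(freq_cases, freq_controls):
    """Allelic odds ratio comparing cases with controls."""
    return (freq_cases / (1.0 - freq_cases)) / (freq_controls / (1.0 - freq_controls))


def expected_odds_ratios(maf, exposure_logit, log_or_exposure,
                         log_or_outcome, outcome_logit=0.0):
    """Expected allelic ORs for cases vs. two kinds of control subjects.

    Cases are exposed individuals with the outcome. Exposure-matched
    controls are exposed individuals without the outcome. Population
    controls follow the general-population genotype distribution.
    Returns (or_vs_exposed_controls, or_vs_population_controls).
    """
    pop = genotype_probs(maf)
    # Probability of pathogen exposure given genotype (hypothetical model).
    p_exp = [logistic(exposure_logit + log_or_exposure * g) for g in range(3)]
    # Probability of the outcome given exposure and genotype (hypothetical model).
    p_out = [logistic(outcome_logit + log_or_outcome * g) for g in range(3)]

    cases = [pop[g] * p_exp[g] * p_out[g] for g in range(3)]
    exposed_controls = [pop[g] * p_exp[g] * (1.0 - p_out[g]) for g in range(3)]

    f_cases = allele_freq(cases)
    or_exposed = allelic_or(f_cases, allele_freq(exposed_controls))
    or_population = allelic_or(f_cases, allele_freq(pop))
    return or_exposed, or_population


# A locus associated with exposure (OR 1.5 per allele) but with no effect
# on the outcome among exposed individuals.
or_exposed, or_population = expected_odds_ratios(
    maf=0.3, exposure_logit=-3.0,
    log_or_exposure=math.log(1.5), log_or_outcome=0.0)
print(f"cases vs. exposed controls:    OR = {or_exposed:.3f}")
print(f"cases vs. population controls: OR = {or_population:.3f}")
```

Under these assumptions, a locus that influences only pathogen exposure shows no association when cases are compared with exposed control subjects, but a non-null odds ratio when compared with population control subjects. Varying `exposure_logit` and `log_or_exposure` shows how the distortion depends on exposure prevalence and on the strength of the locus-exposure association.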
Competing Interests: All authors declare no competing interests.
(Copyright © 2022 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.)
Database: MEDLINE