The blood DNA virome in 8,000 humans

Authors: Yaron Turpaz, William H. Biggs, Eric Delwart, Kenneth Joel Bloom, Karen E. Nelson, Amalio Telenti, Chao Xie, Ahmed A. Moustafa, Ewen F. Kirkness, Emily H. M. Wong, J. Craig Venter
Language: English
Publication year: 2017
Subject:
0301 basic medicine
RNA viruses
Male
Physiology
viruses
Merkel cell polyomavirus
Hepacivirus
medicine.disease_cause
Database and Informatics Methods
Medicine and Health Sciences
Prevalence
Child
lcsh:QH301-705.5
Pathology and laboratory medicine
Genetics
Aged, 80 and over
Viral Genomics
biology
Hepatitis C virus
virus diseases
Genomics
Medical microbiology
Middle Aged
Genomic Databases
Body Fluids
Blood
Virus Diseases
Viral evolution
Child, Preschool
Viruses
Female
Anatomy
Pathogens
Oncovirus
Research Article
lcsh:Immunologic diseases. Allergy
Adult
Adolescent
030106 microbiology
Immunology
Microbial Genomics
Research and Analysis Methods
Microbiology
Virus
Human Genomics
03 medical and health sciences
Young Adult
Virology
medicine
Humans
Human virome
Molecular Biology
Aged
Flaviviruses
Parvovirus
Organisms
Viral pathogens
Biology and Life Sciences
Computational Biology
Infant
biology.organism_classification
Genome Analysis
Genomic Libraries
Hepatitis viruses
Microbial pathogens
030104 developmental biology
Biological Databases
lcsh:Biology (General)
DNA, Viral
Parasitology
lcsh:RC581-607
DNA viruses
Reference genome
Source: PLoS Pathogens
PLoS Pathogens, Vol 13, Iss 3, p e1006292 (2017)
ISSN: 1553-7374
1553-7366
Description: The characterization of the blood virome is important for the safety of blood-derived transfusion products, and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from the pool of sequence reads that did not map to the human reference genome. Analyses sifted through close to 1 petabyte of sequence data and performed 0.5 trillion similarity searches. With a lower bound for identification of 2 viral genomes/100,000 cells, we mapped sequences to 94 different viruses, including sequences from 19 human DNA viruses, proviruses and RNA viruses (herpesviruses, anelloviruses, papillomaviruses, three polyomaviruses, adenovirus, HIV, HTLV, hepatitis B, hepatitis C, parvovirus B19, and influenza virus) in 42% of the study participants. Of possible relevance to transfusion medicine, we identified Merkel cell polyomavirus in 49 individuals, papillomavirus in blood of 13 individuals, parvovirus B19 in 6 individuals, and the presence of herpesvirus 8 in 3 individuals. The presence of DNA sequences from two RNA viruses was unexpected: the hepatitis C virus sequence indicates an integration event, while the influenza virus sequence resulted from immunization with a DNA vaccine. Age, sex and ancestry contributed significantly to the prevalence of infection. The remaining 75 viruses mostly reflect extensive contamination from commercial reagents and from the environment. These technical problems represent a major challenge for the identification of novel human pathogens. Increasing availability of human whole-genome sequences will contribute substantial amounts of data on the composition of the normal and pathogenic human blood virome. Distinguishing contaminants from real human viruses is challenging.
Author summary: Novel sequencing technologies offer insight into the virome in human samples. Here, we identify the viral DNA sequences in blood of over 8,000 individuals undergoing whole-genome sequencing. This approach serves to identify 94 viruses; however, many are shown to reflect widespread DNA contamination of commercial reagents or to be of environmental origin. While this represents a significant limitation to the reliable identification of novel viruses infecting humans, we could confidently detect sequences and quantify abundance of 19 human viruses in 42% of individuals. Ancestry, sex, and age were important determinants of viral prevalence. This large study calls attention to the challenge of interpreting next-generation sequencing data for the identification of novel viruses. However, it serves to categorize the abundance of human DNA viruses using an unbiased technique.
Database: OpenAIRE