The blood DNA virome in 8,000 humans

Authors:	Yaron Turpaz, William H. Biggs, Eric Delwart, Kenneth Joel Bloom, Karen E. Nelson, Amalio Telenti, Chao Xie, Ahmed A. Moustafa, Ewen F. Kirkness, Emily H. M. Wong, J. Craig Venter
Language:	English
Year of publication:	2017
Subject:	0301 basic medicine RNA viruses Male Physiology viruses Merkel cell polyomavirus Hepacivirus medicine.disease_cause Database and Informatics Methods Medicine and Health Sciences Prevalence Child lcsh:QH301-705.5 Pathology and laboratory medicine Genetics Aged 80 and over Viral Genomics biology Hepatitis C virus virus diseases Genomics Medical microbiology Middle Aged Genomic Databases Body Fluids Blood Virus Diseases Viral evolution Child Preschool Viruses Female Anatomy Pathogens Oncovirus Research Article lcsh:Immunologic diseases. Allergy Adult Adolescent 030106 microbiology Immunology Microbial Genomics Research and Analysis Methods Microbiology Virus Human Genomics 03 medical and health sciences Young Adult Virology medicine Humans Human virome Molecular Biology Aged Flaviviruses Parvovirus Organisms Viral pathogens Biology and Life Sciences Computational Biology Infant biology.organism_classification Genome Analysis Genomic Libraries Hepatitis viruses Microbial pathogens 030104 developmental biology Biological Databases lcsh:Biology (General) DNA Viral Parasitology lcsh:RC581-607 DNA viruses Reference genome
Source:	PLoS Pathogens, Vol 13, Iss 3, p e1006292 (2017)
ISSN:	1553-7374 1553-7366
Description:	The characterization of the blood virome is important for the safety of blood-derived transfusion products and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from the pool of sequence reads that did not map to the human reference genome. Analyses sifted through close to 1 petabyte of sequence data and performed 0.5 trillion similarity searches. With a lower bound for identification of 2 viral genomes per 100,000 cells, we mapped sequences to 94 different viruses, including sequences from 19 human DNA viruses, proviruses and RNA viruses (herpesviruses, anelloviruses, papillomaviruses, three polyomaviruses, adenovirus, HIV, HTLV, hepatitis B, hepatitis C, parvovirus B19, and influenza virus) in 42% of the study participants. Of possible relevance to transfusion medicine, we identified Merkel cell polyomavirus in 49 individuals, papillomavirus in the blood of 13 individuals, parvovirus B19 in 6 individuals, and herpesvirus 8 in 3 individuals. The presence of DNA sequences from two RNA viruses was unexpected: the hepatitis C virus sequence is indicative of an integration event, while the influenza virus sequence resulted from immunization with a DNA vaccine. Age, sex and ancestry contributed significantly to the prevalence of infection. The remaining 75 viruses mostly reflect extensive contamination from commercial reagents and from the environment. These technical problems represent a major challenge for the identification of novel human pathogens. Increasing availability of human whole-genome sequences will contribute substantial amounts of data on the composition of the normal and pathogenic human blood virome. Distinguishing contaminants from real human viruses is challenging.
Author summary: Novel sequencing technologies offer insight into the virome in human samples. Here, we identify the viral DNA sequences in the blood of over 8,000 individuals undergoing whole-genome sequencing. This approach identifies 94 viruses; however, many are shown to reflect widespread DNA contamination of commercial reagents or to be of environmental origin. While this significantly limits the reliable identification of novel viruses infecting humans, we could confidently detect sequences and quantify the abundance of 19 human viruses in 42% of individuals. Ancestry, sex, and age were important determinants of viral prevalence. This large study calls attention to the challenge of interpreting next-generation sequencing data for the identification of novel viruses. However, it serves to categorize the abundance of human DNA viruses using an unbiased technique.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::819b74b67ddf11c833ae0a6a6d1746ef http://europepmc.org/articles/PMC5378407