Whole genome single-nucleotide variation profile-based phylogenetic tree building methods for analysis of viral, bacterial and human genomes.

Authors: Faison WJ; The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA. Electronic address: Jamie_Faison@gwmail.gwu.edu., Rostovtsev A; Center for Biologics Evaluation and Research, US Food and Drug Administration, 1451 Rockville Pike, Rockville, MD 20852, USA. Electronic address: Alexandre.Rostovtsev@fda.hhs.gov., Castro-Nallar E; Computational Biology Institute, George Washington University, Ashburn, VA 20147, USA. Electronic address: Ecastron@gwmail.gwu.edu., Crandall KA; Computational Biology Institute, George Washington University, Ashburn, VA 20147, USA. Electronic address: Kcrandall@gwu.edu., Chumakov K; Center for Biologics Evaluation and Research, US Food and Drug Administration, 1451 Rockville Pike, Rockville, MD 20852, USA. Electronic address: Konstantin.Chumakov@fda.hhs.gov., Simonyan V; Center for Biologics Evaluation and Research, US Food and Drug Administration, 1451 Rockville Pike, Rockville, MD 20852, USA. Electronic address: Vahan.Simonyan@fda.hhs.gov., Mazumder R; The Department of Biochemistry & Molecular Medicine, George Washington University Medical Center, Washington, DC 20037, USA; McCormick Genomic and Proteomic Center, George Washington University, Washington, DC 20037, USA. Electronic address: Mazumder@gwu.edu.
Language: English
Source: Genomics [Genomics] 2014 Jul; Vol. 104 (1), pp. 1-7. Date of Electronic Publication: 2014 Jun 12.
DOI: 10.1016/j.ygeno.2014.06.001
Abstract: Next-generation sequencing (NGS) data can be mapped to a reference genome to identify single-nucleotide polymorphisms/variations (SNPs/SNVs; called SNPs hereafter). In theory, SNPs can be compared across several samples and the differences used to create phylogenetic trees depicting relatedness among the samples. In practice, however, this is difficult because there is currently no stand-alone tool that takes SNP data directly as input and produces phylogenetic trees. In response to this need, the PhyloSNP application was created with two analysis methods: 1) a quantitative method that either creates a presence/absence matrix that can be used directly to generate phylogenetic trees or creates a tree from a shrunk genome alignment (which includes additional bases surrounding each SNP position), and 2) a qualitative method that clusters samples based on the frequency of different bases found at a particular position. The algorithms were used to generate trees from Poliovirus, Burkholderia and human cancer genomics NGS datasets.
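The presence/absence idea mentioned in the abstract can be illustrated with a minimal sketch. This is not PhyloSNP's implementation: the sample names, the representation of a SNP as a (position, base) tuple, and the use of Hamming distance as the pairwise measure are all assumptions made for illustration. The resulting distances are the kind of input a distance-based tree-building method could take.

```python
from itertools import combinations

# Hypothetical input: for each sample, the set of SNP calls relative to a
# shared reference, written as (reference position, alternate base).
samples = {
    "sample_A": {(101, "T"), (250, "G"), (999, "A")},
    "sample_B": {(101, "T"), (250, "G")},
    "sample_C": {(400, "C"), (999, "A")},
}


def presence_absence_matrix(snp_calls):
    """Return the sorted list of all SNPs and a 0/1 row per sample."""
    all_snps = sorted(set().union(*snp_calls.values()))
    matrix = {
        name: [1 if snp in calls else 0 for snp in all_snps]
        for name, calls in snp_calls.items()
    }
    return all_snps, matrix


def hamming_distances(matrix):
    """Count the SNP columns at which each pair of samples differs."""
    return {
        (a, b): sum(x != y for x, y in zip(matrix[a], matrix[b]))
        for a, b in combinations(sorted(matrix), 2)
    }


all_snps, matrix = presence_absence_matrix(samples)
distances = hamming_distances(matrix)

for name, row in matrix.items():
    print(name, row)
for pair, dist in distances.items():
    print(pair, dist)
```

In this toy data, sample_A and sample_B share two of their SNPs and so end up closer to each other than either is to sample_C, which is the relationship a tree built from the matrix would be expected to reflect.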
Availability: PhyloSNP is freely available for download at http://hive.biochemistry.gwu.edu/dna.cgi?cmd=phylosnp.
(Copyright © 2014 Elsevier Inc. All rights reserved.)
Database: MEDLINE