Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips
Authors: | James Wiley, Helmut Butzkueven, Trevor Kilpatrick, Simon Foote, Jeannette Lechner-Scott, Ruijie Liu, Allan Kermode, Justin Rubio, Matthew A Brown, Matthew Ritchie, Tony Merriman, Jim Stankovich, Ruth Topless, Pablo Moscato, Lyn Griffiths, Andrea Polanowski, Benilton Carvalho, Peter Csurhes, Michael Pender, Mark Slee, Rodney Scott, Judith Greer, Judith Field, Joanne Dickinson |
---|---|
Publication year: | 2011 |
Subject: | Genotype; Genome-wide association study; Single-nucleotide polymorphism; Biology; lcsh:Computer applications to medicine. Medical informatics; Polymorphism Single Nucleotide; Biochemistry; 03 medical and health sciences; 0302 clinical medicine; Gene Frequency; Structural Biology; Cluster Analysis; Humans; International HapMap Project; lcsh:QH301-705.5; Molecular Biology; Genotyping; Allele frequency; Alleles; 030304 developmental biology; Genetic association; 0303 health sciences; Models Statistical; Methodology Article; Applied Mathematics; Computer Science Applications; Minor allele frequency; lcsh:Biology (General); Sample size determination; Sample Size; lcsh:R858-859.7; Algorithm; Algorithms; 030217 neurology & neurosurgery; Genome-Wide Association Study |
Source: | BMC Bioinformatics, Vol 12, Iss 1, p 68 (2011) |
ISSN: | 1471-2105 |
DOI: | 10.1186/1471-2105-12-68 |
Description: | Background: Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms that use different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and on data from a recently published genome-wide association study. Results: In general, differences in accuracy between the methods evaluated are relatively small, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no-call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods that ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from the four methods were found to agree closely when flagging samples with unusual signal characteristics. Conclusions: CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus, on the other hand, requires a larger number of samples to achieve comparable levels of accuracy, and its use in smaller studies (50 or fewer individuals) is not recommended. |
Database: | OpenAIRE |