MungeSumstats: a Bioconductor package for the standardization and quality control of many GWAS summary statistics

Author: Alan E Murphy, Brian M Schilder, Nathan G. Skene
Publication year: 2021
Subject:
Statistics and Probability
Technology
Biochemistry & Molecular Biology
AcademicSubjects/SCI01060
Bioinformatics
Computer science
Statistics & Probability
Databases and Ontologies
Control (management)
Genome-wide association study
Biochemistry
Biochemical Research Methods
Bioconductor
Quality (business)
Molecular Biology
Data objects
01 Mathematical Sciences
Variant Call Format
Science & Technology
Information retrieval
06 Biological Sciences
File format
Applications Notes
Summary statistics
Computer Science Applications
Computational Mathematics
Biotechnology & Applied Microbiology
Computational Theory and Mathematics
Physical Sciences
Computer Science
Computer Science, Interdisciplinary Applications
Mathematical & Computational Biology
08 Information and Computing Sciences
Life Sciences & Biomedicine
Mathematics
Source: Bioinformatics
ISSN: 1460-2059
1367-4803
Description: Motivation: Genome-wide association study (GWAS) summary statistics have popularized and accelerated genetic research. However, the lack of a standardized file format has proven problematic when running secondary analysis tools or performing meta-analyses. Results: To address this issue, we have developed MungeSumstats, a Bioconductor R package for the standardization and quality control of GWAS summary statistics. MungeSumstats can handle the most common summary statistic formats, including variant call format (VCF), and produces a reformatted, standardized, tabular summary statistic file, a VCF or an R native data object. Availability and implementation: MungeSumstats is available on Bioconductor (v 3.13) and can also be found on GitHub at: https://neurogenomics.github.io/MungeSumstats. Supplementary information: Supplementary data are available at Bioinformatics online.
Database: OpenAIRE