QuASAR-MPRA: accurate allele-specific analysis for massively parallel reporter assays
Author: | Roger Pique-Regi, Francesca Luca, Christopher D. Brown, Xiaoquan Wen, Cynthia A Kalita, Gregory A Moyerbrailean |
---|---|
Year of publication: | 2017 |
Subject: |
0301 basic medicine
Statistics and Probability; Computer science; Single-nucleotide polymorphism; Computational biology; Allelic Imbalance; Regulatory Sequences Nucleic Acid; Biology; Polymorphism Single Nucleotide; Biochemistry; Genome; 03 medical and health sciences; 0302 clinical medicine; Plasmid; Gene expression; Humans; Allele; Enhancer; Molecular Biology; Gene; Alleles; 030304 developmental biology; Genetics; Regulation of gene expression; 0303 health sciences; Genome Human; Computational Biology; Replicate; Original Papers; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Gene Expression Regulation; Computational Theory and Mathematics; Regulatory sequence; Nucleic acid; Human genome; Software; 030217 neurology & neurosurgery |
Source: | Bioinformatics. 34:787-794 |
ISSN: | 1367-4811 1367-4803 |
DOI: | 10.1093/bioinformatics/btx598 |
Description: | Motivation: The majority of the human genome is composed of non-coding regions containing regulatory elements such as enhancers, which are crucial for controlling gene expression. Many variants associated with complex traits are in these regions and may disrupt gene regulatory sequences. Consequently, it is important not only to identify true enhancers but also to test whether a variant within an enhancer affects gene regulation. Recently, allele-specific analysis in high-throughput reporter assays, such as massively parallel reporter assays (MPRAs), has been used to functionally validate non-coding variants. However, we are still missing high-quality and robust data analysis tools for these datasets. Results: We have further developed our method for allele-specific analysis, QuASAR (quantitative allele-specific analysis of reads), to analyze allele-specific signals in barcoded read count data from MPRA. Using this approach, we can take into account the uncertainty in the original plasmid proportions, over-dispersion, and sequencing errors. The provided allelic skew estimate and its standard error also simplify meta-analysis of replicate experiments. Additionally, we show that a beta-binomial distribution better models the variability present in the allelic imbalance of these synthetic reporters and results in a test that is statistically well calibrated under the null. Applying this approach to the MPRA data, we found 602 SNPs with significant (false discovery rate 10%) allele-specific regulatory function in LCLs. We also show that we can combine MPRA with QuASAR estimates to validate existing experimental and computational annotations of regulatory variants. Our study shows that with appropriate data analysis tools, we can improve the power to detect allelic effects in high-throughput reporter assays.
Availability and implementation: http://github.com/piquelab/QuASAR/tree/master/mpra Supplementary information: Supplementary data are available online at Bioinformatics. |
Database: | OpenAIRE |
External link: |
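
The description notes that reporting an allelic skew estimate together with its standard error simplifies meta-analysis of replicate experiments. The sketch below illustrates one common way such per-replicate estimates could be combined, assuming a fixed-effect inverse-variance weighting. The record does not say which scheme QuASAR-MPRA itself uses, so the weighting, the function name `combine_replicates`, and the normal approximation for the p-value are all assumptions made for illustration.

```python
import math


def combine_replicates(estimates, std_errors):
    """Combine per-replicate allelic skew estimates.

    Assumption: fixed-effect inverse-variance weighting. This is a generic
    meta-analysis scheme, not necessarily what QuASAR-MPRA implements.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Each replicate is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    # Weighted mean of the estimates and its standard error.
    combined = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    combined_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value under a normal approximation (assumed).
    z_score = combined / combined_se
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return combined, combined_se, z_score, p_value


# Hypothetical skew estimates and standard errors from two replicates.
beta, se, z, p = combine_replicates([0.2, 0.4], [0.1, 0.1])
print(beta, se, z, p)
```

Because each replicate contributes only an estimate and a standard error, no raw read counts need to be pooled. Under this assumed scheme, replicates with smaller standard errors carry more weight in the combined estimate.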