OLOGRAM: Determining significance of total overlap length between genomic regions sets.
Authors: | Ferré Q; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Aix Marseille Univ, CNRS, UMR, LIS, Qarma, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer., Charbonnier G; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer., Sadouni N; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer., Lopez F; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer., Kermezli Y; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer.; Tlemcen University, The Laboratory of Applied Molecular Biology and Immunology, Algeria., Spicuglia S; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer., Capponi C; Aix Marseille Univ, CNRS, UMR, LIS, Qarma, Marseille, France., Ghattas B; Aix Marseille Univ, CNRS, UMR, IMM, Marseille, France., Puthier D; Aix Marseille Univ, INSERM, UMR U1090, TAGC, Marseille, France.; Equipe Labellisée LIGUE contre le Cancer. |
---|---|
Language: | English |
Source: | Bioinformatics (Oxford, England) [Bioinformatics] 2019 Nov 05. Date of Electronic Publication: 2019 Nov 05. |
DOI: | 10.1093/bioinformatics/btz810 |
Abstract: | Motivation: Various bioinformatics analyses provide sets of genomic coordinates of interest. Whether two such sets possess a functional relation is a frequent question. This is often determined by interpreting the statistical significance of their overlaps. However, only a few existing methods consider the lengths of the overlaps, and they do not provide a resolutive p-value. Results: Here, we introduce OLOGRAM, which performs overlap statistics between sets of genomic regions described in BED or GTF files. It uses Monte Carlo simulation, taking into account the distributions of both region and inter-region lengths, to fit a negative binomial model of the total overlap length. Exclusion of user-defined genomic areas during the shuffling is supported. Availability: This tool is available through the command line interface of the pygtftk toolkit. It has been tested on Linux and OSX and is available on Bioconda and from https://github.com/dputhier/pygtftk under the GNU GPL license. Supplementary Information: Supplementary data are available at Bioinformatics online. (© The Author(s) (2019). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.) |
Database: | MEDLINE |
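
The approach summarized in the abstract (shuffle regions while preserving the region and inter-region length distributions, collect the total overlap length of each shuffle, fit a negative binomial model, and read a p-value from it) can be illustrated with a minimal sketch. This is not the pygtftk implementation: every function name below is hypothetical, the sketch assumes a single chromosome with sorted, non-overlapping half-open intervals, it fits the negative binomial by the method of moments as a simplification, and it reports only the upper-tail (enrichment) p-value.

```python
import math
import random
from statistics import mean, variance


def total_overlap(a, b):
    """Total shared length (bp) of two sorted, non-overlapping interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            total += end - start
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def shuffle_regions(regions, chrom_len, rng):
    """Rebuild a region set from independently permuted region and gap lengths.

    Assumption for this sketch: shuffling both lists separately is one way to
    keep the region and inter-region length distributions unchanged.
    """
    lengths = [end - start for start, end in regions]
    gaps = [regions[0][0]]
    gaps += [regions[k + 1][0] - regions[k][1] for k in range(len(regions) - 1)]
    gaps.append(chrom_len - regions[-1][1])
    rng.shuffle(lengths)
    rng.shuffle(gaps)
    shuffled, pos = [], 0
    # zip stops after len(lengths) gaps; the leftover gap is the trailing one.
    for gap, length in zip(gaps, lengths):
        pos += gap
        shuffled.append((pos, pos + length))
        pos += length
    return shuffled


def nb_upper_tail(observed, null_values):
    """P(X >= observed) under a negative binomial fitted by method of moments."""
    m, v = mean(null_values), variance(null_values)
    if v <= m:
        raise ValueError("negative binomial fit needs variance > mean")
    p = m / v                # success probability
    r = m * m / (v - m)      # dispersion (number of successes), may be non-integer

    def log_pmf(k):
        return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
                + r * math.log(p) + k * math.log1p(-p))

    cdf_below = sum(math.exp(log_pmf(k)) for k in range(observed))
    # 1 - CDF loses precision for tiny tails; clamp rounding noise at zero.
    return max(0.0, 1.0 - cdf_below)


def overlap_p_value(a, b, chrom_len, n_shuffles=200, seed=0):
    """Return (observed total overlap, enrichment p-value) for region sets a and b."""
    rng = random.Random(seed)
    observed = total_overlap(a, b)
    null = [
        total_overlap(shuffle_regions(a, chrom_len, rng),
                      shuffle_regions(b, chrom_len, rng))
        for _ in range(n_shuffles)
    ]
    return observed, nb_upper_tail(observed, null)
```

Fitting a parametric model to the shuffled overlaps, rather than counting how many shuffles exceed the observed value, is what lets the p-value fall below the 1/n_shuffles floor of a purely empirical estimate. A more careful version would sum the upper tail directly (or in log space) to keep precision for very small p-values.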