GeneSetCluster: a tool for summarizing and integrating gene-set analysis results
Author: | David Gomez-Cabrero, Ewoud Ewing, Nuria Planell-Picola, Maja Jagodic |
---|---|
Language: | English |
Publication year: | 2020 |
Subject: |
Multiple Sclerosis; Computer science; Dimethyl Fumarate; Overlapping pathways; Interval (mathematics); lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Cell Line; Set (abstract data type); 03 medical and health sciences; User-Computer Interface; 0302 clinical medicine; Structural Biology; Cluster Analysis; Data Mining; Humans; Data-mining; Cluster analysis; Molecular Biology; Gene; lcsh:QH301-705.5; 030304 developmental biology; 0303 health sciences; Clustering gene-sets; Information retrieval; Applied Mathematics; Clustering pathways; DNA Methylation; Computer Science Applications; Identification (information); lcsh:Biology (General); 030220 oncology & carcinogenesis; Gene-set enrichment; DNA methylation; lcsh:R858-859.7; DNA microarray; Reactive Oxygen Species; Overlapping gene; Software |
Source: | BMC Bioinformatics, Vol 21, Iss 1, Pp 1-7 (2020) |
ISSN: | 1471-2105 |
DOI: | 10.1186/s12859-020-03784-z |
Description: | Background: Gene-set analysis tools, which make use of curated sets of molecules grouped by shared function, aim to identify which gene-sets are over-represented among the features associated with a given trait of interest. Such tools, for example Ingenuity or GSEA, are frequently used in gene-centric analyses of RNA-sequencing or microarray data, but they have also been adapted for interval-based analyses of DNA methylation or ChIP/ATAC-sequencing data. Gene-set analysis tools return a list of significant gene-sets. While these results help the researcher identify major biological insights, they can be difficult to interpret because many gene-sets have largely overlapping gene content. In addition, the analysis often returns a large number of gene-sets, which makes the major biological insights harder to identify. Results: We present GeneSetCluster, a novel approach that clusters identified gene-sets, from one or multiple experiments and/or tools, based on shared genes. GeneSetCluster calculates a distance score based on overlapping gene content and uses it to cluster the gene-sets; as a result, it identifies groups of gene-sets with similar gene-set definitions (i.e. gene content). Researchers can then focus on these groups for biological interpretation. Conclusions: GeneSetCluster is a novel approach for grouping gene-set analysis results based on overlapping gene content. It is implemented as an R package. The package and the vignette can be downloaded at https://github.com/TranslationalBioinformaticsUnit
Database: | OpenAIRE |
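
The description above says that gene-sets are grouped by a distance score computed from overlapping gene content. The following Python sketch illustrates that general idea only. It is not the GeneSetCluster implementation (which is an R package): the Jaccard distance, the single-linkage grouping, the `max_distance` threshold, the function names, and the toy gene-sets are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| (assumed stand-in for the distance score).

    0.0 means identical gene content, 1.0 means no shared genes.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_gene_sets(gene_sets, max_distance=0.5):
    """Group gene-sets by single linkage.

    Two gene-sets land in the same group when they are connected by a chain
    of pairs whose distance is at most max_distance (hypothetical threshold).
    """
    names = list(gene_sets)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for x, y in combinations(names, 2):
        if jaccard_distance(gene_sets[x], gene_sets[y]) <= max_distance:
            parent[find(x)] = find(y)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy gene-sets, not taken from the paper.
example = {
    "oxidative_stress": {"NFE2L2", "HMOX1", "NQO1", "GCLM"},
    "ros_response": {"NFE2L2", "HMOX1", "NQO1", "SOD1"},
    "dna_methylation": {"DNMT1", "DNMT3A", "TET2"},
}
clusters = cluster_gene_sets(example)
print(clusters)
```

In this toy input the two stress-related gene-sets share three of five distinct genes, so they fall into one group, while the methylation gene-set shares none and stays on its own. A real analysis would more likely use an established hierarchical or k-means clustering routine on the full distance matrix.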