RACS: rapid analysis of ChIP-Seq data for contig based genomes
Authors: | Jeffrey S. Fillingham, Marcelo Ponce, Syed Nabeel-Shah, Alejandro Saettone |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
Computer science; ved/biology.organism_classification_rank.species; Computational biology; Bioinformatics pipeline; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Genome; DNA sequencing; Tetrahymena thermophila; 03 medical and health sciences; 0302 clinical medicine; Structural Biology; Next generation sequencing; False positive paradox; Humans; Quantitative Biology - Genomics; Model organism; Molecular Biology; lcsh:QH301-705.5; High-performance computing; 030304 developmental biology; Genomics (q-bio.GN); 0303 health sciences; Contig; ved/biology; Applied Mathematics; Methodology Article; Chromosome Mapping; Molecular Sequence Annotation; Genomics; Sequence Analysis, DNA; Pipeline (software); Chromatin immunoprecipitation; Computer Science Applications; Data set; Pipeline transport; lcsh:Biology (General); FOS: Biological sciences; lcsh:R858-859.7; Chromatin Immunoprecipitation Sequencing; 030217 neurology & neurosurgery |
Source: | BMC Bioinformatics, Vol 20, Iss 1, Pp 1-17 (2019) |
ISSN: | 1471-2105 |
Description: | Background: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely used technique to investigate the function of chromatin-related proteins in a genome-wide manner. ChIP-Seq generates large quantities of data, which can be difficult to process and analyze, particularly for organisms with contig-based genomes. Contig-based genomes often have poor annotations for cis-elements, for example enhancers, that are important for gene expression. Poorly annotated genomes make a comprehensive analysis of ChIP-Seq data difficult and, as a result, standardized analysis pipelines are lacking. Methods: We report a computational pipeline that utilizes traditional High-Performance Computing techniques and open source tools for processing and analyzing data obtained from ChIP-Seq. We applied our computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), to ChIP-Seq data generated in the model organism Tetrahymena thermophila, an example of an organism with a genome that is available in contigs. Results: To test the performance and efficiency of RACS, we performed control ChIP-Seq experiments, which allowed us to rapidly eliminate false positives when analyzing our previously published data set. Our pipeline separates the detected read accumulations into genic and intergenic regions and is highly efficient for rapid downstream analyses. Conclusions: Altogether, the computational pipeline presented in this report is an efficient and highly reliable tool to analyze genome-wide ChIP-Seq data generated in model organisms with contig-based genomes. RACS is an open source computational pipeline available to download from https://bitbucket.org/mjponce/racs or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS. Submitted to BMC Bioinformatics. |
Database: | OpenAIRE |
External link: |