Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages
Authors: | Virginie de Schaetzen, Robin Duque, Jonatan Taminau, David Y. Weiss Solís, Colin Molter, Alain Coletta, Hugues Bersini, Stijn Meganck, Ann Nowé, Cosmin Lazar, David Steenhoff |
---|---|
Contributors: | Computational Modelling, Department of Bio-engineering Sciences, Informatics and Applied Informatics |
Language: | English |
Subject: |
Computer science; Biochimie; Informatique appliquée logiciel; Context (language use); lcsh:Computer applications to medicine. Medical informatics; computer.software_genre; Biochemistry; Access to Information; Bioconductor; Set (abstract data type); Machine learning for data mining; Structural Biology; Batch effect removal; Gene expression; Humans; Microarray databases; lcsh:QH301-705.5; Molecular Biology; Oligonucleotide Array Sequence Analysis; Microarray analysis techniques; Gene Expression Profiling; Applied Mathematics; Biologie moléculaire; InSilico DB; artificial intelligence; Reproducibility; Computer Science Applications; Gene expression profiling; Microarray repositories; lcsh:Biology (General); Gene chip analysis; lcsh:R858-859.7; Data integration; Data mining; DNA microarray; computer; Software |
Source: | Vrije Universiteit Brussel; BMC Bioinformatics; BMC Bioinformatics, Vol 13, Iss 1, p 335 (2012); BMC bioinformatics, 13 (1 |
ISSN: | 1471-2105 |
DOI: | 10.1186/1471-2105-13-335 |
Description: | Background: With an abundance of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, the analysis itself is no longer the problem; the new bottleneck is retrieving and consistently integrating all these data before delivering them to the wide variety of existing analysis tools. Results: We present the newly released inSilicoMerging R/Bioconductor package which, together with the earlier released inSilicoDb R/Bioconductor package, allows consistent retrieval, integration and analysis of publicly available microarray gene expression data sets. The inSilicoMerging package also provides a set of five visual and six quantitative validation measures. Conclusions: By providing (i) access to uniformly curated and preprocessed data, (ii) a collection of techniques to remove the batch effects between data sets from different sources, and (iii) several validation tools enabling the inspection of the integration process, these packages enable researchers to fully explore the potential of combining gene expression data for downstream analysis. The power of using both packages is demonstrated by programmatically retrieving and integrating gene expression studies from the InSilico DB repository [https://insilicodb.org/app/]. © 2012 Taminau et al.; licensee BioMed Central Ltd. |
Database: | OpenAIRE |
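The abstract mentions techniques that remove batch effects between data sets from different sources. As a rough illustration of the idea, the following minimal sketch applies batch mean-centering, one simple and commonly used approach: each gene is shifted to zero mean within its own study before the studies are concatenated on their shared genes. This is not the API of inSilicoDb or inSilicoMerging (which are R packages); the function name `batch_mean_center`, the dictionary layout, and the expression values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def batch_mean_center(batches):
    """Merge studies after centering each gene at zero within each study.

    `batches` is a list of dicts, one per study (hypothetical layout),
    each mapping a gene symbol to a list of expression values,
    one value per sample.
    """
    # Keep only the genes that were measured in every study.
    common = set(batches[0])
    for batch in batches[1:]:
        common &= set(batch)

    merged = {gene: [] for gene in sorted(common)}
    for batch in batches:
        for gene in merged:
            values = batch[gene]
            gene_mean = mean(values)
            # Subtracting the per-study mean removes the study-specific offset.
            merged[gene].extend(v - gene_mean for v in values)
    return merged


# Made-up toy values: study_b is shifted well below study_a for the same genes.
study_a = {"TP53": [8.0, 9.0, 10.0], "BRCA1": [5.0, 7.0, 6.0], "EGFR": [3.0, 3.0, 3.0]}
study_b = {"TP53": [2.0, 4.0], "BRCA1": [1.0, 3.0]}

merged = batch_mean_center([study_a, study_b])
for gene, values in merged.items():
    print(gene, values)
```

After centering, samples from both toy studies sit on a common scale for each shared gene, while genes missing from any study (here `EGFR`) are dropped. Mean-centering only corrects additive offsets; it does not address differences in variance between studies.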