Integrating omics datasets with the OmicsPLS package
Author: | Jeanine J. Houwing-Duistermaat, Szymon M. Kielbasa, Lucija Klaric, Said el Bouhaddani, Caroline Hayward, Geurt Jongbloed, Hae-Won Uh |
---|---|
Contributors: | el Bouhaddani S., Uh H.-W., Jongbloed G., Hayward C., Klaric L., Kielbasa S.M., Houwing-Duistermaat J. |
Language: | English |
Publication year: | 2018 |
Subject: |
0301 basic medicine; 03 medical and health sciences; 030104 developmental biology; Computer science; Data-specific variation; Variation (game tree); lcsh:Computer applications to medicine. Medical informatics; lcsh:R858-859.7; lcsh:Biology (General); lcsh:QH301-705.5; Biochemistry; Search engine; Software; Structural Biology; O2PLS; Humans; Metabolomics; Least-Squares Analysis; Molecular Biology; Thesaurus (information retrieval); Joint principal components; Applied Mathematics; R package; Genomics; Omics data integration; Omics; Computer Science Applications; Data set; Task (computing); Data mining; Data integration |
Source: | Bouhaddani, S E, Uh, H-W, Jongbloed, G, Hayward, C, Klarić, L, Kiełbasa, S M & Houwing-Duistermaat, J 2018, 'Integrating omics datasets with the OmicsPLS package', BMC Bioinformatics, vol. 19, no. 1, article 371, pp. 1-9. https://doi.org/10.1186/s12859-018-2371-3. BioMed Central |
ISSN: | 1471-2105 |
DOI: | 10.1186/s12859-018-2371-3 |
Description: | Background: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, no such software is available for O2PLS. Results: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard cross-validation method and a faster alternative are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. Conclusions: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS"). Electronic supplementary material: The online version of this article (10.1186/s12859-018-2371-3) contains supplementary material, which is available to authorized users. |
Database: | OpenAIRE |
External link: |