Interoperable and scalable data analysis with microservices: applications in metabolomics

Authors: Namrata Kale, Sven Bergmann, Philippe Rocca-Serra, Kenneth Haug, Kristian Peters, Gianluigi Zanetti, Ola Spjuth, Kim Kultima, Christoph Ruttkies, Etienne A. Thévenot, David Johnson, Marco Capuccini, Carles Foguet, Payam Emami Khoonsari, Rico Rueedi, Anders Larsson, Pedro de Atauri, Vitaly A. Selivanov, Pierrick Roger, Pablo Moreno, Luca Pireddu, Noureddin Sadawi, Christoph Steinbeck, Sijin He, Marta Cascante, Stephanie Herman, Susanna-Assunta Sansone, Michael van Vliet, Daniel Schober, Thomas Hankemeier, Matteo Carone, Joachim Burman, Steffen Neumann, Reza M. Salek, Alejandra Gonzalez-Beltran
Language: English
Year of publication: 2019
Subjects:
Data Analysis
Statistics and Probability
Source code
Programari
Bioinformatics
Computer science
Distributed computing
Interoperability
Microservices
kubernetes
Biochemistry
Internetworking (Telecommunication)
Field (computer science)
Workflow
03 medical and health sciences
microservices
0302 clinical medicine
Software
Interoperabilitat en xarxes d'ordinadors
Metabolomics
Computer software
Molecular Biology
e-infrastructure
030304 developmental biology
Bioinformatics (Computational Biology)
0303 health sciences
Docker
Mass spectrometry
Systems Biology
Computational Biology
container
Original Papers
metabolomics
Computer Science Applications
Computational Mathematics
Espectrometria de masses
Computational Theory and Mathematics
Scalability
Container (abstract data type)
Bioinformatik (beräkningsbiologi)
Software engineering
030217 neurology & neurosurgery
Source: Bioinformatics, vol. 35, no. 19, pp. 3752-3760
Dipòsit Digital de la UB
Universidad de Barcelona
Description:
Motivation: Developing a robust and performant data analysis workflow that integrates all necessary components while still scaling across multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, in which software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator.
Results: We developed a Virtual Research Environment (VRE) that facilitates the rapid integration of new tools and the development of scalable, interoperable workflows for metabolomics data analysis. The environment can be launched on demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validated our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability by integrating the major software suites into a turn-key workflow encompassing all steps of mass-spectrometry-based metabolomics, including preprocessing, statistics and identification. Microservices are a generic methodology that can serve any scientific discipline and open the way to new types of large-scale integrative science.
Availability and implementation: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects.
Supplementary information: Supplementary data are available at Bioinformatics online.
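Illustrative note: the abstract describes tools packaged as Docker containers and run as workflow steps on Kubernetes. The sketch below is not taken from the PhenoMeNal code base; it only shows, under stated assumptions, how one containerized tool per step could be described as a Kubernetes `batch/v1` Job manifest. The step names, image names and command lines are hypothetical placeholders, and a real workflow engine would additionally handle data volumes, step dependencies and submission to the cluster.

```python
import json


def job_manifest(step_name, image, command):
    """Build a Kubernetes batch/v1 Job manifest (plain dict) for one containerized tool."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": step_name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed step automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": step_name, "image": image, "command": command}
                    ],
                }
            },
        },
    }


# Hypothetical three-step workflow mirroring the stages named in the abstract
# (preprocessing, statistics, identification). Images and commands are invented.
WORKFLOW = [
    ("preprocessing", "example.org/tools/preprocess:1.0", ["preprocess", "/data/raw"]),
    ("statistics", "example.org/tools/stats:1.0", ["stats", "/data/peaks"]),
    ("identification", "example.org/tools/identify:1.0", ["identify", "/data/stats"]),
]


def build_workflow(steps):
    """Return one Job manifest per workflow step, in execution order."""
    return [job_manifest(name, image, command) for name, image, command in steps]


if __name__ == "__main__":
    for manifest in build_workflow(WORKFLOW):
        print(json.dumps(manifest, indent=2))
```

Each printed manifest could be saved to a file and submitted with `kubectl apply -f`; because every step is an independent container, steps without mutual dependencies can be scheduled on different nodes.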
Database: OpenAIRE