Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language

Author: Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, The CWL Community
Contributors: Computer Systems, Network Institute, Bioinformatics, AIMMS, Integrative Bioinformatics, Massivizing Computer Systems, Vrije Universiteit Amsterdam [Amsterdam] (VU), Software Freedom Conservancy, Pennsylvania State University (Penn State), Penn State System, Seven Bridges Genomics, Hub Bioinformatique et Biostatistique - Bioinformatics and Biostatistics HUB, Institut Pasteur [Paris] (IP)-Université Paris Cité (UPCité), University of Manchester [Manchester], University of Amsterdam [Amsterdam] (UvA), European Commission grants BioExcel-2 (SSR) H2020-INFRAEDI-02-2018 823830, BioExcel (SSR) H2020-EINFRA-2015-1 675728, EOSC-Life (SSR) H2020-INFRAEOSC-2018-2 824087, EOSCPilot (MRC) H2020-INFRADEV-2016-2 739563, IBISBA 1.0 (SSR) H2020-INFRAIA-2017-1-two-stage 730976, ELIXIR-EXCELERATE (SSR, HM) H2020-INFRADEV-1-2015-1 676559, ASTERICS (MRC) INFRADEV-4-2014-2015. ELIXIR, the research infrastructure for life-science data, Interoperability Platform Implementation Study (MRC). 2018-CWL. Various universities have also co-sponsored this project.
We thank Vrije Universiteit of Amsterdam, the Netherlands, where the first three authors have their primary affiliation. The CWL project is immensely grateful to the following self-identified CWL Community members and their contributions to the project: Miguel d’Arcangues Boland (Software, Bug Reports, Maintenance), Alain Domissy (Conceptualization, Answering Questions, Tools), Andrey Kislyuk (Software, Bug Reports), Brandi Davis-Dusenbery (Conceptualization, Funding acquisition, Investigation, Project Administration, Resources, Supervision, Business Development, Event Organizing, Talks), Niels Drost (Funding Acquisition, Blogposts, Event Organizing, Tutorials, Talks), Robert Finn (Data acquisition, Funding acquisition, Investigation, Resources, Supervision), Michael Franklin (Software, Bug Reports, Documentation, Event Organizing, Maintenance, Tools, Answering Questions, Talks), Manabu Ishii (Blogposts, Documentation, Examples, Event Organizing, Maintenance, Tools, Answering Questions, Translation, Tutorials, Talks), Sinisa Ivkovic (Software, Validation, Bug Reports, Tools), Alexander Kanitz (Software, Business Development, Tools, Talks), Sehrish Kanwal (Conceptualization, Formal Analysis, Investigation, Software, Validation, Bug Reports, Blogposts, Content, Event Organizing, Maintenance, Answering Questions, Tools, Tutorials, Talks, User Testing), Andrey Kartashov (Conceptualization, Software, Validation, Examples, Tools, Answering Questions), Farah Khan (Conceptualization, Formal Analysis, Funding Acquisition, Software), Michael Kotliar (Software, Validation, Bug Reports, Blogposts, Examples, Maintenance, Answering Questions, Reviewed Contributions, Tools, Talks, User Testing), Folker Meyer (Tools), Rupert Nash (Software, Bug Reports, Talks, Videos), Maya Nedeljkovich (Software, Validation, Visualization, Writing -- review & editing, Bug Reports, Tools, Talks), Tazro Ohta (Formal Analysis, Funding Acquisition, Resources, Validation, Bug Reports, Blogposts, Content, Documentation, Examples, Event Organizing, Answering Questions, Tools, Translation, Tutorials, Talks, User Testing), Pjotr Prins (Blogposts, Packaging, Bug Reports), Manvendra Singh (Software, Blogposts, Packaging, Tools, Reviewed Contributions), Andrey Tovchigrechko (Conceptualization, Software, Bug Reports), Alan Williams (Investigation), Denis Yuen (Software, Bug Reports, Documentation, Tools), Alexander (Sasha) Wait Zaranek (Conceptualization, Funding Acquisition), Sarah Wait Zaranek (Conceptualization, Funding Acquisition, Project Administration, Resources, Software, Accessibility, Bug Reports, Business Development, Content, Examples, Event Organizing, Answering Questions, Tutorials, Talks).
European Project: 823830, H2020-EU.1.4.1.3. Development, deployment and operation of ICT-based e-infrastructures, H2020-EU.1.4. EXCELLENT SCIENCE - Research Infrastructures, BioExcel-2 (2019); European Project: 675728, H2020, H2020-EINFRA-2015-1, BioExcel (2015); European Project: 824087, EOSC-Life; European Project: 676559, H2020, H2020-INFRADEV-1-2015-1, ELIXIR-EXCELERATE (2015); European Project: 653477, H2020, H2020-INFRADEV-1-2014-1, ASTERICS (2015)
Year of publication: 2021
Subject:
FOS: Computer and information sciences
Data flow architectures workflows
cs.DC
General Computer Science
Computer science
Bioinformatics
Population genetics
Enterprise computing infrastructures
Earth and atmospheric sciences
scientific workflows
• Computer systems organization → Distributed architectures
Reuse
CWL
Imaging
Set (abstract data type)
Computational biology
Software portability
SDG 17 - Partnerships for the Goals
Computational transcriptomics
Cloud computing
CCS CONCEPTS • Computing methodologies → Distributed computing methodologies
[INFO]Computer Science [cs]
computational data analysis
Transcriptomics
business.industry
Enterprise interoperability
Computational proteomics
• General and reference → Computing standards
Life and medical sciences
Computational genomics
Intervention (law)
Workflow
Computer Science - Distributed, Parallel, and Cluster Computing
RFCs and guidelines
• Applied computing → Astronomy
Grid computing
Cost control
standards
workflows
Distributed, Parallel, and Cluster Computing (cs.DC)
Software engineering
business
Systems biology
Source: Crusoe, M R, Abeln, S, Iosup, A, Amstutz, P, Chilton, J, Tijanić, N, Ménager, H, Soiland-Reyes, S, Gavrilović, B, Goble, C A & The CWL Community 2022, 'Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language', Communications of the ACM, vol. 65, no. 6, pp. 54–63. https://doi.org/10.48550/arXiv.2105.07028, https://doi.org/10.1145/3486897
Communications of the ACM, 65(6), 54-63. Association for Computing Machinery (ACM)
Communications of the ACM
Communications of the ACM, 2021, 65 (6), pp.54-63. ⟨10.1145/3486897⟩
Crusoe, M R, Abeln, S, Iosup, A, Amstutz, P, Chilton, J, Tijanić, N, Ménager, H, Soiland-Reyes, S & Goble, C 2022, 'Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language', Communications of the ACM, vol. 65, no. 6, pp. 54–63. https://doi.org/10.1145/3486897
ISSN: 0001-0782
1557-7317
DOI: 10.48550/arxiv.2105.07028
Description: Computational Workflows are widely used in data analysis, enabling innovation and decision-making. In many domains (bioinformatics, image analysis, & radio astronomy) the analysis components are numerous and written in multiple different computer languages by third parties. However, many competing workflow systems exist, severely limiting portability of such workflows, thereby hindering the transfer of workflows between different systems, between different projects and different settings, leading to vendor lock-ins and limiting their generic re-usability. Here we present the Common Workflow Language (CWL) project which produces free and open standards for describing command-line tool based workflows. The CWL standards provide a common but reduced set of abstractions that are both used in practice and implemented in many popular workflow systems. The CWL language is declarative, which allows expressing computational workflows constructed from diverse software tools, executed each through their command-line interface. Being explicit about the runtime environment and any use of software containers enables portability and reuse. Workflows written according to the CWL standards are a reusable description of that analysis that are runnable on a diverse set of computing environments. These descriptions contain enough information for advanced optimization without additional input from workflow authors. The CWL standards support polylingual workflows, enabling portability and reuse of such workflows, easing for example scholarly publication, fulfilling regulatory requirements, collaboration in/between academic research and industry, while reducing implementation costs. CWL has been taken up by a wide variety of domains, and industries and support has been implemented in many major workflow systems.
8 pages, 3 figures. For the LaTeX source code of this paper, see https://github.com/mr-c/cwl_methods_included
Database: OpenAIRE