Reproducible Containers
Authors: | Omar Navarro Leija, Joseph Devietti, Ryan G. Scott, Baojun Wang, Nicholas Renner, Ryan R. Newton, Kelly Shiptoski |
---|---|
Year of publication: | 2020 |
Subject: | Source lines of code; Computer science; Software engineering; Engineering and technology; Replication (computing); Workflow; Software; Information systems; Container (abstract data type); Electrical engineering, electronic engineering, information engineering; User space; Operating system; Overhead (computing); Abstraction (linguistics) |
Source: | Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. |
Description: | We describe the design and implementation of DetTrace, a reproducible container abstraction for Linux implemented in user space. All computation that occurs inside a DetTrace container is a pure function of the initial filesystem state of the container. Reproducible containers can be used for a variety of purposes, including replication for fault-tolerance, reproducible software builds and reproducible data analytics. We use DetTrace to achieve, in an automatic fashion, reproducibility for 12,130 Debian package builds, containing over 800 million lines of code, as well as bioinformatics and machine learning workflows. We show that, while software in each of these domains is initially irreproducible, DetTrace brings reproducibility without requiring any hardware, OS or application changes. DetTrace's performance is dictated by the frequency of system calls: IO-intensive software builds have an average overhead of 3.49x, while a compute-bound bioinformatics workflow is under 2%. |
Database: | OpenAIRE |