eddy4R: A community-extensible processing, analysis and modeling framework for eddy-covariance data based on R, Git, Docker and HDF5
Authors: | Andrei Serafimovich, Hongyan Luo, David Durden, Natchaya Pingintha-Durden, Stefan Metzger, Ankur R. Desai, Jörg Hartmann, Ke Xu, Torsten Sachs, Cove Sturtevant, Jiahong Li |
---|---|
Year of publication: | 2017 |
Subject: | 0301 basic medicine; Data processing; 010504 meteorology & atmospheric sciences; Database; Computer science; Distributed computing; Modular design; Hierarchical Data Format; 01 natural sciences; 03 medical and health sciences; Consistency (database systems); 030104 developmental biology; Software; Scalability; DevOps; Host (network); 0105 earth and related environmental sciences |
DOI: | 10.5194/gmd-2016-318 |
Description: | This study presents the systematic development of an open-source, flexible and modular eddy-covariance (EC) data processing framework. This is achieved by adopting a Development and Systems Operation (DevOps) philosophy, building on the eddy4R family of EC code packages in the R Language for Statistical Computing as a foundation. These packages are community-developed via the GitHub distributed version control system and wrapped into a portable and reproducible Docker filesystem that is independent of the underlying host operating system. The HDF5 hierarchical data format then provides a streamlined mechanism for highly compressed and fully self-documented data ingest and output. This framework is applicable beyond EC, and more generally builds the capacity to deploy complex algorithms developed by scientists in an efficient and scalable manner. In addition, modularity permits meeting project milestones while retaining extensibility over time. The efficiency and consistency of this framework are demonstrated in three application examples: tower EC data from the first instruments installed at a National Ecological Observatory Network (NEON) field site, aircraft flux measurements in combination with remote sensing data, and a software intercomparison. In conjunction with this study, the first two eddy4R packages and simple NEON EC data products are released publicly. While this proof-of-concept represents a significant advance, substantial work remains to arrive at the automated framework needed for the streaming generation of science-grade EC fluxes. |
Database: | OpenAIRE |