eddy4R: A community-extensible processing, analysis and modeling framework for eddy-covariance data based on R, Git, Docker and HDF5

Authors: Andrei Serafimovich, Hongyan Luo, David Durden, Natchaya Pingintha-Durden, Stefan Metzger, Ankur R. Desai, Jörg Hartmann, Ke Xu, Torsten Sachs, Cove Sturtevant, Jiahong Li
Publication year: 2017
Subject:
DOI: 10.5194/gmd-2016-318
Description: This study presents the systematic development of an open-source, flexible and modular eddy-covariance (EC) data processing framework. This is achieved by adopting a Development and Systems Operation (DevOps) philosophy, building on the eddy4R family of EC code packages in the R Language for Statistical Computing as its foundation. These packages are community-developed on GitHub using the Git distributed version control system and wrapped into a portable and reproducible Docker filesystem that is independent of the underlying host operating system. The HDF5 hierarchical data format then provides a streamlined mechanism for highly compressed and fully self-documented data ingest and output. This framework is applicable beyond EC, and more generally builds the capacity to deploy complex algorithms developed by scientists in an efficient and scalable manner. In addition, modularity permits meeting project milestones while retaining extensibility over time. The efficiency and consistency of this framework are demonstrated in three application examples: tower EC data from the first instruments installed at a National Ecological Observatory Network (NEON) field site, aircraft flux measurements in combination with remote sensing data, and a software intercomparison. In conjunction with this study, the first two eddy4R packages and simple NEON EC data products are released publicly. While this proof-of-concept represents a significant advance, substantial work remains to arrive at the automated framework needed for the streaming generation of science-grade EC fluxes.
Database: OpenAIRE