Developing semantic interoperability in ecosystem studies: semantic modelling and annotation for FAIR data production

Autor: Christian Pichot, Nicolas Beudez, Cécile Callou, André Chanzy, Alyssa Clavreul, Philippe Clastre, Benjamin Jaillet, François Lafolie, Jean-François Le Galliard, Chloé Martin, Florent Massol, Damien Maurice, Nicolas Moitrier, Ghislaine Monet, Hélène Raynal, Antoine Schellenberger, Rachid Yahiaoui
Rok vydání: 2022
DOI: 10.5194/egusphere-egu22-10213
Popis: The study of ecosystem characteristics and functioning requires multidisciplinary approaches and mobilises multiple research teams. Data are collected or computed in large quantity but are most often poorly standardised and therefore heterogeneous. In this context the development of semantic interoperability is a major challenge for the sharing and reuse of these data. This objective is implemented within the framework of the AnaEE (Analysis and Experimentation on Ecosystems) Research Infrastructure dedicated to experimentation on ecosystems and biodiversity. A distributed Information System (IS) is developed, based on the semantic interoperability of its components using common vocabularies (AnaeeThes thesaurus and OBOE-based ontology extended for disciplinary needs) for modelling observations and their experimental context. The modelling covers the measured variables, the different components of the experimental context, from sensor and plot to network. It consists in the atomic decomposition of the observations, identifying the observed entities, their characteristics and qualification, naming standards and measurement units. This modelling allows the semantic annotation of relational databases and flat files for the production of graph databases. A first pipeline is developed for the automation of the annotation process and the production of the semantic data, annotation that may represent a huge conceptual and practical work without such automation. A second pipeline is devoted to the exploitation of these semantic data through the generation i) of standardized GeoDCAT and ISO metadata records and ii) of data files (NetCDF format) from selected perimeters (experimental sites, years, experimental factors, measured variables...). Carried out on all the data generated by the experimental platforms, this practice will produce semantically interoperable data that meets the linked opendata standards. The work carried out contributes to the development and use of semantic vocabularies within the ecology research community. The genericity of the tools make them usable in different contexts of ontologies and databases.
Databáze: OpenAIRE