11.1 ISTEX: An Innovative Scientific Repository for the French Research Community

Authors: Raymond Berard, Claire Francois, Laurent Schmitt, and Jean-Marie Pierrel
Language: English
Publication year: 2015
Subject:
DOI: 10.5281/zenodo.3603203
Description: ISTEX is a project launched in 2012 and funded by the National Agency for Research (ANR). It is part of the ‘Investments for the future’ programme initiated by the French Ministry for Higher Education and Research. The two main objectives of ISTEX are to acquire retrospective collections of scientific publications (articles, monographs, etc.) and to set up a platform that will host all the data and offer advanced research and delivery services. This paper focuses on the second aspect of ISTEX: how to build a normalised, uniform and enriched data repository that will support scientific research. In addition to providing search engine services across articles and collections, and full-text indexing, the ISTEX team is currently working towards offering three types of data enrichment: terminology extraction for terms and their variants; named entity recognition; and identification of cited bibliographic references in unstructured full texts. Moreover, several sub-projects have been established to show how labs could use ISTEX resources in their surveys or research. Firstly, the CILLEX project, led by CLLE in Toulouse, aims to develop metrological tools based on the structures of small-world networks, which are omnipresent in document databases, in order to better identify pertinent information. Secondly, the ISTEX-R project, carried out by LORIA, ATILF and INIST, targets the creation of tools for accessing textual content, in order to build on and capitalise on knowledge in a given scientific domain. It aims to complement the basic platform with content analysis that characterises the evolution of research and knowledge over time. Finally, the LorExplor project proposes to build an open source library of XML components for constructing research systems, corpus exploration servers and data curation chains that meet specific needs (special survey, bibliography, meta-analysis, etc.).
In this paper we describe the three levels of the ISTEX platform: the basic treatments, the enrichment processes and the research sub-projects included in the ISTEX framework. Our main objective is to show how this French initiative is a ‘win-win’ step for both libraries and research.
Laurent Schmitt is a research engineer at the French National Research Centre (CNRS), and heads the ‘Projects and Innovation’ department at INIST-CNRS, which facilitates access to scientific results from all fields of world research, promotes scientific production and provides services to people in Higher Education and Research in France. He is also in charge of the ISTEX work packages assigned to INIST: the development of the ISTEX platform, which will host, enrich and disseminate all of the acquired data. He is trained both as a computer scientist and in natural language computing.
Database: OpenAIRE