Reuse or New Development: sustainability of resources and tools for multi-facetted historical data and languages

Authors: Vertan, Cristina; González, Alicia; Verkinderen, Peter
Year of publication: 2016
DOI: 10.5281/zenodo.160375
Description: Data in the humanities, especially historical data, is characterized by a strong presence of vague information and uncertainty. The available content management systems and annotation tools have often disregarded the requirements of research projects dealing with fuzzy data, languages with non-concatenative morphologies, and non-Latin scripts. Additionally, data encoding standards often overstress the importance of mere standardization at the expense of human readability and of efficiency in storage and parsing. Similarly, morphological tag sets and natural language processing frameworks based primarily on Indo-European languages are presented as universal solutions, but fail to account for some of the linguistic phenomena characteristic of other languages.
Database: OpenAIRE