Popis: |
Data in the humanities, especially historical data, is characterized by a strong presence of vague information and uncertainty. Available content management systems and annotation tools have often disregarded the requirements of research projects dealing with fuzzy data, languages with non-concatenative morphology, and non-Latin scripts. Additionally, data encoding standards often overemphasize mere standardization at the expense of human readability and of efficiency in storage and parsing. Similarly, morphological tag sets and natural language processing frameworks based primarily on Indo-European languages are presented as universal solutions, yet they fail to account for some linguistic phenomena characteristic of other languages. |