A hybrid disambiguation measure for inaccurate cultural heritage data
Author: | Efremova, I., Ranjbar-Sahraei, B., Calders, T.G.K., Zervanou, K., Vertan, C. |
---|---|
Contributors: | Information Systems WSK&I, Process Science |
Language: | English |
Year of publication: | 2014 |
Source: | 14th Conference of the European Chapter Association for Computational Linguistics (EACL2014), April 26, 2014, Gothenburg, Sweden, pp. 1-9 |
Description: | Cultural heritage data is always associated with inaccurate information and different types of ambiguities. For instance, names of persons, occupations or places mentioned in historical documents are not standardized and contain numerous variations. This article examines in detail various existing similarity functions and proposes a hybrid technique for the following task: among the list of possible names, occupations and places extracted from historical documents, identify those that are variations of the same person name, occupation and place, respectively. The performance of our method is evaluated on three manually constructed datasets and one public dataset in terms of precision, recall and F-measure. The results demonstrate that the hybrid technique outperforms current methods and makes it possible to significantly improve the quality of cultural heritage data. |
Database: | OpenAIRE |