Problémy automatické morfologické disambiguace češtiny = Problems of automatic morphological disambiguation of Czech.

Author: Petkevič, Vladimír, 1954-
Language: Czech
Subject:
Document type: Non-fiction
ISSN: 0027-8203
Abstract: The article focuses on some of the main problems in the current automatic morphological disambiguation of Czech. Following a description of the methods used to disambiguate Czech texts and of their accuracy, the author discusses the main reasons why correct morphological disambiguation of the Czech texts contained in the corpora of the SYN series of the Czech National Corpus project is very difficult to achieve, and why, notwithstanding an improvement in disambiguation (e.g. the SYN2013PUB corpus is tagged better than the SYN2000 corpus), much work remains to be done. The author concentrates exclusively on the problems of rule-based disambiguation rather than stochastic disambiguation, trying to identify areas where disambiguation could be improved in the future. The article also stresses that reliable disambiguation of Czech texts is a key prerequisite for their successful subsequent syntactic analysis.
Database: Katalog Knihovny AV ČR