Morphological analysis and disambiguation for Breton

Authors: Nick Howell, Francis M. Tyers
Year of publication: 2020
Source: Language Resources and Evaluation. 55:431-473
ISSN: 1574-0218, 1574-020X
DOI: 10.1007/s10579-020-09510-8
Description: In this paper we present an extended description of two resources for natural language processing of Breton: a morphological analyser and a constraint-grammar-based disambiguator. The constraint grammar was developed using a novel methodology in which a linguist and a language consultant wrote rules to resolve specific disambiguation errors in a machine translation system. In addition, we introduce a new morphologically disambiguated corpus of Breton and evaluate both the morphological analyser and the constraint grammar for coverage and accuracy. For comparison, we use the same corpus to train several reference systems for part-of-speech tagging and lemmatisation. The experiments show that our system outperforms the reference systems by a wide margin when they are trained without an external full-form list, and performs comparably when they are trained with a full-form list generated from our morphological analyser.
Database: OpenAIRE