A Comparison of Lithuanian Morphological Analyzers

Authors: Erika Rimkutė, Jurgita Kapočiūtė-Dzikienė, Loïc Boizou
Year of publication: 2017
Subject:
Source: Text, Speech, and Dialogue ISBN: 9783319642055
TSD
DOI: 10.1007/978-3-319-64206-2_6
Description: In this paper we present comparative research disclosing the strengths and weaknesses of the two most popular publicly available Lithuanian morphological analyzers, namely Lemuoklis and Semantika.lt. Their performance on lemmatization, part-of-speech tagging, and fine-grained annotation of morphological categories (such as case, gender, tense, etc.) was evaluated on a morphologically annotated gold standard corpus covering four domains: administrative, fiction, scientific, and periodical texts. Semantika.lt significantly outperformed Lemuoklis by \(\sim\)1.7%, \(\sim\)2.5%, and \(\sim\)8.1% on the lemmatization, part-of-speech tagging, and fine-grained annotation tasks, achieving accuracies of \(\sim\)98.0%, \(\sim\)95.3%, and \(\sim\)86.8%, respectively.
Database: OpenAIRE