Designing HMM‐Based Part‐of‐Speech Tagger for Lithuanian Language

Authors:	Gailius Raškinis, Jan Kuper, Vilma Griciūtė, Giedrė Pajarskaitė
Year of publication:	2004
Subject:	Computer science; Computer Science::Information Retrieval; Applied Mathematics; Speech recognition; Supervised learning; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Context (language use); Lithuanian; Viterbi algorithm; Part of speech; Set (abstract data type); Artificial intelligence; Hidden Markov model; Word (computer architecture); Natural language processing; Information Systems
Source:	Informatica. 15:231-242
ISSN:	1822-8844 0868-4952
Description:	This paper describes a preliminary experiment in designing a Hidden Markov Model (HMM)-based part-of-speech tagger for the Lithuanian language. Part-of-speech tagging is the problem of assigning to each word of a text the proper tag in its context of appearance. It is accomplished in two basic steps: morphological analysis and disambiguation. In this paper, we focus on the problem of disambiguation, i.e., on choosing the correct tag for each word, in its context, from a set of possible tags. We constructed a stochastic disambiguation algorithm, based on supervised learning techniques, to learn the hidden Markov model's parameters from hand-annotated corpora. The Viterbi algorithm is used to assign the most probable tag to each word in the text.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::8ecdbceda0ca1682977d7c5d13e03194 https://doi.org/10.15388/informatica.2004.056
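
The description above outlines two stages: estimating HMM parameters from a hand-annotated corpus, then decoding with the Viterbi algorithm. The following is a minimal Python sketch of that general technique, not the authors' implementation. The record does not state the model order, smoothing method, or tagset, so the bigram model, the add-one smoothing, the `START` pseudo-tag, all function names, and the English toy corpus (standing in for an annotated Lithuanian corpus) are assumptions made for illustration. The optional `candidates` argument mimics the output of a morphological analyzer by restricting each word to a set of possible tags.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start pseudo-tag (an assumption)


def train(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, model, candidates=None):
    """Return the most probable tag sequence for `words`.

    `candidates`, if given, lists the allowed tags for each word,
    as a morphological analyzer might supply them.
    """
    trans, emit, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra outcome reserved for unknown words
    best = []  # best[i][tag] = (log score, best previous tag)
    for i, word in enumerate(words):
        allowed = candidates[i] if candidates else tags
        column = {}
        for tag in allowed:
            e = log_prob(emit[tag], word, n_words)
            if i == 0:
                column[tag] = (log_prob(trans[START], tag, n_tags) + e, None)
            else:
                score, prev = max(
                    (best[i - 1][p][0] + log_prob(trans[p], tag, n_tags), p)
                    for p in best[i - 1]
                )
                column[tag] = (score + e, prev)
        best.append(column)
    if not best:
        return []
    # Pick the best final tag, then follow back-pointers to the start.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


# Toy annotated corpus (illustrative only, not data from the paper).
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train(TOY_CORPUS)
print(viterbi(["the", "dog", "sleeps"], model))
```

Scores are kept in log space so that long sentences do not underflow, and restricting each column to the candidate tags keeps decoding cost proportional to the ambiguity left after morphological analysis rather than to the full tagset.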