POS Tagging of Hungarian with Combined Statistical and Rule-Based Methods
Authors: | András Hócza, János Csirik, András Kuba |
---|---|
Publication year: | 2004 |
Source: | Text, Speech and Dialogue (TSD), ISBN: 9783540230496 |
DOI: | 10.1007/978-3-540-30120-2_15 |
Description: | This paper surveys the key results achieved so far in Hungarian POS tagging. The most successful approaches were selected and re-evaluated on a manually annotated corpus of 1.2 million words. Tests were performed in single-domain, multiple-domain and cross-domain settings. We then investigate whether the selected POS tagging methods can be improved further by combining them. Our aim is to build a POS tagger that achieves good results on a fine-grained tag set of more than 1000 tags. |
Database: | OpenAIRE |