Developing a finite-state morphological analyzer for Urdu and Hindi

Author: Bögel, Tina; Butt, Miriam; Hautli, Annette; Sulger, Sebastian
Language: English
Year of publication: 2008
Subject:
Document type: InProceedings
Description: We introduce and discuss a number of issues that arise in the process of building a finite-state morphological analyzer for Urdu, in particular issues with potential ambiguity and non-concatenative morphology. Our approach allows for an underlyingly similar treatment of both Urdu and Hindi via a cascade of finite-state transducers that transliterates the very different scripts into a common ASCII transcription system. As this transliteration system is based on the XFST tools that the Urdu/Hindi common morphological analyzer is also implemented in, no compatibility problems arise.
Database: Networked Digital Library of Theses & Dissertations