Authors: |
Bögel, Tina; Butt, Miriam; Hautli, Annette; Sulger, Sebastian |
Language: |
English |
Year of publication: |
2008 |
Subject: |
|
Document type: |
InProceedings |
Description: |
We introduce and discuss a number of issues that arise in building a finite-state morphological analyzer for Urdu, in particular potential ambiguity and non-concatenative morphology. Our approach allows for an underlyingly similar treatment of both Urdu and Hindi via a cascade of finite-state transducers that transliterates the very different scripts into a common ASCII transcription system. Because this transliteration system is built with the same XFST tools in which the common Urdu/Hindi morphological analyzer is implemented, no compatibility problems arise. |
Database: |
Networked Digital Library of Theses & Dissertations |
External link: |
|