Grammar sharing techniques for rule-based multilingual NLP systems

Author: Santaholma, Marianne Elina
Language: English
Publication year: 2007
Subject:
Source: Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA) (2007)
Description: Rule-based multilingual natural language processing (NLP) applications, such as machine translation systems, require the development of grammars for multiple languages. Grammar writing, however, is often a slow and laborious process. In this paper we describe a methodology for multilingual and multipurpose grammar development based on grammar sharing. The paper presents the first step towards a language-independent core grammar for the recognition, analysis and generation of English, Japanese and Finnish in a domain-specific spoken language translation system. It focuses on the grammar architecture and rule-writing principles. Evaluation of analysis and generation has shown that two thirds of the rules are shared between these three typologically different languages.
Database: OpenAIRE