Improving Semantic Parsing with Enriched Synchronous Context-Free Grammars in Statistical Machine Translation
Author: | Junhui Li, Guodong Zhou, Wei Lu, Muhua Zhu |
---|---|
Publication year: | 2016 |
Subject: |
Parsing; General Computer Science; Machine translation; Computer science; 02 engineering and technology; Top-down parsing; 03 medical and health sciences; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; 0302 clinical medicine; Rule-based machine translation; 030221 ophthalmology & optometry; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Top-down parsing language; Synchronous context-free grammar; S-attributed grammar; Artificial intelligence; Natural language processing; Bottom-up parsing |
Source: | ACM Transactions on Asian and Low-Resource Language Information Processing. 16:1-24 |
ISSN: | 2375-4702 2375-4699 |
DOI: | 10.1145/2963099 |
Description: | Semantic parsing maps a natural language sentence to a structured meaning representation. Previous studies show that semantic parsing with synchronous context-free grammars (SCFGs) performs favorably against most alternatives. Motivated by the observation that the performance of semantic parsing with SCFGs is closely tied to the translation rules, this article explores three ways of extending the translation rules to improve their quality and coverage. First, we examine the difference between word alignments for semantic parsing and for statistical machine translation (SMT) in order to better adapt SMT word alignment to semantic parsing. Second, we introduce both structure-informed and syntax-informed nonterminals, which better guide parsing toward well-formed structures, instead of using an uninformed nonterminal in SCFGs. Third, we address the unknown word translation issue via synthetic translation rules. Finally, we use a filtering approach that improves performance by predicting the answer type. Evaluation on the standard GeoQuery benchmark dataset shows that our approach greatly outperforms the state of the art across various languages, including English, Chinese, Thai, German, and Greek. |
Database: | OpenAIRE |
External link: |