Evaluating Grammaticality in Seq2seq Models with a Broad Coverage HPSG Grammar: A Case Study on Machine Translation
Authors: | Brendan O'Connor, Khiem Pham, Brian Dillon, Johnny Tian-Zheng Wei |
---|---|
Year of publication: | 2018 |
Subject: |
Head-driven phrase structure grammar, Sequence, Grammar, Machine translation, Computer science, Syntax, Grammaticality, Artificial intelligence, Natural language, Natural language processing |
Source: | BlackboxNLP@EMNLP |
DOI: | 10.18653/v1/w18-5432 |
Description: | Sequence-to-sequence (seq2seq) models are often employed in settings where the target output is natural language. However, the syntactic properties of the language generated by these models are not well understood. We explore whether such output belongs to a formal and realistic grammar by employing the English Resource Grammar (ERG), a broad-coverage, linguistically precise HPSG-based grammar of English. Using a French-to-English parallel corpus, we analyze the parseability of output from a seq2seq translation model and the grammatical constructions that occur in it. Over 93% of the model translations are parseable, suggesting that the model learns to generate output that conforms to a grammar. The model has trouble learning the distribution of rarer syntactic rules, and we pinpoint several constructions that differentiate the reference translations from the model's translations. |
Database: | OpenAIRE |
External link: |