Shift-Reduce Constituent Parsing with Neural Lookahead Features

Authors: Jiangming Liu, Yue Zhang
Publication year: 2017
Subject:
FOS: Computer and information sciences
Linguistics and Language
Computer science
Speech recognition
02 engineering and technology
010501 environmental sciences
Top-down parsing
01 natural sciences
Parser combinator
Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
Leverage (statistics)
0105 earth and related environmental sciences
Computer Science - Computation and Language
Parsing
Communication
Parsing expression grammar
Computer Science Applications
Human-Computer Interaction
Theory of Computation: Mathematical Logic and Formal Languages
020201 artificial intelligence & image processing
S-attributed grammar
Artificial intelligence
Computation and Language (cs.CL)
Natural language processing
Sentence
Bottom-up parsing
Source: Transactions of the Association for Computational Linguistics. 5:45-58
ISSN: 2307-387X
DOI: 10.1162/tacl_a_00045
Description: Transition-based models can be fast and accurate for constituent parsing. Compared with chart-based models, they leverage richer features by extracting history information from a parser stack, which consists of a sequence of non-local constituents. On the other hand, during incremental parsing, constituent information on the right-hand side of the current word is not utilized, which is a relative weakness of shift-reduce parsing. To address this limitation, we leverage a fast neural model to extract lookahead features. In particular, we build a bidirectional LSTM model, which leverages full sentence information to predict the hierarchy of constituents that each word starts and ends. The results are then passed to a strong transition-based constituent parser as lookahead features. The resulting parser gives a 1.3% absolute improvement on WSJ and 2.3% on CTB compared to the baseline, giving the highest reported accuracies for fully supervised parsing.
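Illustration: the description above mentions two ideas that a small example may make more concrete: a shift-reduce transition system that builds constituents on a stack, and per-word labels for the constituents that each word starts and ends. The following is a minimal sketch, not the authors' implementation. The action names (`SHIFT`, `REDUCE-X`, `UNARY-X`), the `Node` class, and the helpers `run_actions`, `boundary_labels`, and `bracket` are all hypothetical. The action sequence is supplied by hand rather than scored by a model, and the start/end labels are read off a finished tree, whereas the paper predicts them with a bidirectional LSTM before parsing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                    # word (leaf) or constituent label
    children: list = field(default_factory=list)
    start: int = 0                                # index of first word covered
    end: int = 0                                  # index of last word covered


def run_actions(words, actions):
    """Apply a hand-written action sequence and return the resulting tree."""
    stack, next_word = [], 0
    for action in actions:
        if action == "SHIFT":
            # Move the next input word onto the stack as a leaf.
            stack.append(Node(words[next_word], [], next_word, next_word))
            next_word += 1
        elif action.startswith("REDUCE-"):
            # Combine the top two stack items into one labeled constituent.
            right, left = stack.pop(), stack.pop()
            label = action[len("REDUCE-"):]
            stack.append(Node(label, [left, right], left.start, right.end))
        elif action.startswith("UNARY-"):
            # Wrap the top stack item in a single-child constituent.
            child = stack.pop()
            label = action[len("UNARY-"):]
            stack.append(Node(label, [child], child.start, child.end))
        else:
            raise ValueError(f"unknown action: {action}")
    if len(stack) != 1 or next_word != len(words):
        raise ValueError("action sequence does not yield a single complete tree")
    return stack[0]


def boundary_labels(tree, n_words):
    """For each word, list the constituents it starts and ends (innermost first)."""
    starts = [[] for _ in range(n_words)]
    ends = [[] for _ in range(n_words)]

    def visit(node):
        for child in node.children:
            visit(child)
        if node.children:                         # skip word leaves
            starts[node.start].append(node.label)
            ends[node.end].append(node.label)

    visit(tree)
    return starts, ends


def bracket(node):
    """Render a tree in bracketed form."""
    if not node.children:
        return node.label
    return "(" + node.label + " " + " ".join(bracket(c) for c in node.children) + ")"


words = ["The", "cat", "sleeps"]
actions = ["SHIFT", "SHIFT", "REDUCE-NP", "SHIFT", "UNARY-VP", "REDUCE-S"]
tree = run_actions(words, actions)
starts, ends = boundary_labels(tree, len(words))

print(bracket(tree))   # (S (NP The cat) (VP sleeps))
print(starts)          # [['NP', 'S'], [], ['VP']]
print(ends)            # [[], ['NP'], ['VP', 'S']]
```

In this toy sentence, when the parser has only shifted "The", the labels for that word already indicate that it opens an NP and an S. That is the kind of right-hand-side information a plain shift-reduce parser does not see at that point.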
Database: OpenAIRE