Fast GLR Parsers for Extended BNF Grammars and Transition Networks
Autor: | Angelo Morzenti, Stefano Crespi Reghizzi, Luca Breveglieri, Angelo Borsotti |
---|---|
Jazyk: | angličtina |
Rok vydání: | 2021 |
Předmět: |
regular right-hand side grammar
graph-structured stack Theoretical computer science Finite-state machine Parsing Syntax (programming languages) Computer Networks and Communications Computer science SPPF GLR parsing Data structure computer.software_genre shared packed parse forest parser performance comparison Human-Computer Interaction TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES Rule-based machine translation infinite ambiguity Memory footprint Regular expression GSS computer Time complexity Software |
Popis: | The Tomita’s Generalized LR ( 1 ) parsing algorithm (GLR), later improved in many ways, runs in a linear time on LR ( 1 ) grammars, and degrades to a polynomial-time bound if the grammar is not deterministic. We address a useful feature not present in the current GLR ( 1 ) methods: the ability to accept grammars of the Extended BNF type (EBNF), the rules of which contain regular expressions. An EBNF grammar is conveniently represented by a collection of finite automata called a Transition Net (TN). We define, analyze and evaluate a new GLR ( 1 ) algorithm, called GELR, that combines the recent LR ( 1 ) parsing algorithm for TNs with the classical GLR data structures: the Graph-Structured Stack representing multiple stacks, and the Shared Packed Parse Forest for multiple syntax trees. The GELR algorithm is proved correct and an efficient implementation incorporating the state-of-the-art Right-Nulled parsing optimization is available. Experimental measures of the GELR parser size, speed and memory footprint are reported for current programming and web languages, and are compared with those of other parsing algorithms. The findings prove that directly parsing EBNF grammars does not penalize speed. Performance comparisons for different computer languages should also be of interest. |
Databáze: | OpenAIRE |
Externí odkaz: |