This section illustrates the process of parsing input using grammar defined in previous and shown on Fig. 3. Below on Fig. 5 the grammar is presented once again, this time with productions written on a separate line.
Parser table calculated with Alpag is shown on Fig. 6.
Note that some publications prefer to use two tables for PDA description: one for shift and reduce actions performed under terminal symbols and a separate one for gotos which are performed under nonterminals. Here a single table contains all the information.
State | Next symbol | Action | Next state | Reduce by production | Reduce to nonterminal |
#1 | A | shift | #3 | ||
SENTENCE | goto | #2 | |||
#2 | $eof | reduce | (0) | ||
#3 | Quick | shift | #5 | ||
Brown | shift | #6 | |||
ADJECTIVES | goto | #4 | |||
#4 | Fox | shift | #8 | ||
#5 | Brown | shift | #7 | ||
Fox | reduce | (2) | ADJECTIVES | ||
#6 | Fox | reduce | (3) | ADJECTIVES | |
#7 | Fox | reduce | (4) | ADJECTIVES | |
#8 | Jumps | shift | #10 | ||
MANYJUMPS | goto | #9 | |||
#9 | Over | shift | #11 | ||
And | shift | #12 | |||
#10 | Over | reduce | (5) | MANYJUMPS | |
And | reduce | (5) | MANYJUMPS | ||
#11 | A | shift | #14 | ||
#12 | Jumps | shift | #13 | ||
#13 | Over | reduce | (6) | MANYJUMPS | |
And | reduce | (6) | MANYJUMPS | ||
#14 | Dog | shift | #15 | ||
#15 | $eof | reduce | (1) | SENTENCE |
The input text being parsed is:
The text was chosen to show how LR(1) handles:
When analyzing process of parsing it is necessary to consider the grammar (Fig. 5), the automaton (Fig. 6) as well as current state of parser stack and remaining input which are changing from step to step.
For each parsing step shown are:
Selected grammar productions are shown with possible 'next' symbols (listed after slash '/'). These are symbols that are likely to appear next in input stream given the surrounding grammar context. Next possible symbols are calculated during parser table generation and converted to lookahead symbols for individual actions where applicable.
Below is a sequence of actions taken by parser. A short explanation of each step is included where applicable.
Starting position:
This is the starting state of automaton
Next symbol: A
Action: shift A to stack and go to state #3
Current position in production (1) before nonterminal ADJECTIVES expands also to starting positions in all productions for that nonterminal (2),(3),(4). At this moment all these production s are plausible alternatives.
In current state next possible terminal can be Quick or Brown. Action taken by parser depends on next terminal.
Next symbol: Quick
Action: shift Quick to stack and goto state #5
After reading Quick only productions (2) and (4) are likely choices.
If the parser was LR(0) it would be impossible to decide whether to reduce right now by production (2) or postpone reduction and read next token which could be Brown matching production (4).
Since the grammar is LR(1), grammar tables contain rules based on preview of next symbol. If next symbol is Fox reduction will be done by production (2). If next symbol is Brown parser will continue pushing symbols to the stack in accordance with production (4)
Next symbol: Fox
Action: reduce by production (2). Remove top of stack element Quick (which corresponds to right-hand-side of this production). It is replaced with nonterminal ADJECTIVES. State below removed element is #3. Parser then performs goto operation under nonterminal that was just placed on stack. Next state after goto is #4.
Currently the only possible next token is Fox (which was used as lookahead in previous step but was not read from input and remains as next input token)
Next symbol: Fox
Action: shift Fox to stack and go to state #8
Current position in production (1) ahead of MANYJUMPS expands also to beginnings of all productions for that nonterminal. Only possible terminal in current state is Jumps.
Next symbol: Jumps
Action: shift Jumps to stack and go to state #10
In current state the only possible action is to reduce by rule (5). Possible next tokens (lookaheads) are Over and And. Since for both these symbols decision is the same, analyzing lookahead symbol is not really necessary to make the decision.
A Full LR(1) parser will have explicit rules for both possible next symbols (Over, And) . A simplified parser may have one default catch-all rule causing reduction by production (5) regardless of the next symbol. Process of discarding lookahead information where it is not necessary to resolve ambiguities is called defaulting. If state #10 was defaulted, the only action in the state would become: 'reduce by production (5) regardless what the next symbol is'. When state is defaulted LR(1) parser behaves like a simple LR(0) no-lookahead parser in these states.
Next symbol: And
Action: reduce by production (5). Remove top of stack symbol Jumps and get back to state #8. Then go to state #9 pushing the nonterminal ‘MANYJUMPS’.
Next input token can be either Over or And. Decision is made after reading next token.
Next symbol: And
Action: Shift And to stack and go to state #12
Next symbol: Jumps
Action: Shift Jumps to stack and go to state #13
At the end of production the only possible action is reduction. Full LR(1) grammar contains explicit rules for all possible next-symbols ( Over, And). However since only one action is possible, there is no real need to analyze lookahead symbol.
An LR(1) parser implementation can be made to prefetch lookahead symbol always or only when necessary. Parsers generated by Alpag can be configured to act either way.
Next symbol: Over (which is not very relevant at the moment)
Action: reduce by production (6). Production has three elements on its right hand side: MANYJUMPS, And, Jumps which correspond to current top-of-stack configuration. Reduction removes these three symbols. Top of stack state becomes #8. Then parser goes to state #9 pushing MANYJUMPS to stack. Of course this newly pushed MANYJUMPS corresponds to left-hand-side of production (6). It is not the same as MANYJUMPS that was earlier on the stack in the same place.
Next symbol: Over
Action: Shift Over to stack and go to state #11
Next symbol: A
Action: Shift A to stack and go to state #14
Next symbol: Dog
Action: Shift Dog to stack and go to state #15.
End of input stream was reached (the lookahead symbol is $eof).
In current state the only action (for end-of-file) is to reduce by production (1).
Next symbol: $eof
Action: reduce by production (1). Remove seven elements from stack that correspond to right hand side of production (1). State on stack below this production is #1. In state #1 the parser pushes just reduced nonterminal SENTENCE and goes to next state #2.
The action in state #2 under symbol $eof is to recognize current top of stack nonterminal SENTENCE as valid input file which ends the parsing process with success.
Syntax tree of parsed text is shown on Fig. 7. It describes decomposition of all nonterminals into their elementary terminals and nonterminals.