alpag.net manual
Parser / Parser operation / Basic parsing
< Automaton | Handling alternative productions >

Basic parsing

Below is an example parser operation scenario including shift and reduce operations.

Following grammar will be used:

@1 STMT: a NT d;
@2 NT: b c;

Here is an automaton for above grammar. For each state list of actions is shown:

state #1
$accept: . STMT
actions
a shift to #3
STMT goto #2
state #2
$accept: STMT .
actions
$end reduce by @3 to $accept
state #3
STMT: a . NT d
actions
b shift to #5
NT goto #4
state #4
STMT: a NT . d
actions
d shift to #7
state #5
NT: b . c
actions
c shift to #6
state #6
NT: b c .
actions
d reduce by @2 to NT
state #7
STMT: a NT d .
actions
$end reduce by @1 to STMT

A schematic representation of automaton activity with regard to the grammar is shown on Fig. 12.

Fig. 12 Simple automaton for CFG

Automaton graph alone is not sufficient to understand parser operation. Stack contents must be also considered. Below is an example of parsing input text using above automaton:

Initial state is #1. Current state is always on top of the stack:

State #1
stack: #1
pending input: a b c d $end
Current position in grammar:
@1 ^ STMT
@2 STMT: ^ a NT d;

First symbol read is a. Action for symbol a in action table for state #1 is to shift and go to state #3.

State #3
stack: #1 a #3
pending input: b c d $end
Current position in grammar:
@1 STMT: a ^ NT d;
@2 NT: ^ b c;

Next input symbol is b. Action for symbol b in state #3 is to shift symbol to stack and go to state #5.

State #5
stack: #1 a #3 b #5
pending input: c d $end
Current position in grammar:
@2 NT: b ^ c;

Next symbol is c. Action for symbol c is to shift symbol to stack and go to state #6.

State #6
stack: #1 a #3 b #5 c #6
pending input: d $end
Current position in grammar:
@2 NT: b c ^;

There is only one action in state #6: to reduce by production @2 to nonterminal NT providing next input symbol is d. Since next symbol indeed is d parser performs reduction.

During reduction, top of stack elements matching the production are removed.

stack: #1 a #3 ( removed: b #5 c #6 )

Automaton transitions and stack operation are synchronized. When reduction is performed top-of-stack element configuration is guaranteed to match the production.

After reduction stack becomes:

State #6
stack: #1 a #3
pending nonterminal: NT (created from b and c removed from stack)
pending input: d $end
Current position in grammar:
@1 STMT: a ^ NT d;

At this moment information saved on parser stack about previously visited becomes important. Removal of terminals b and c exposed state #3 which is now top-of-stack an thus current. Without stack, parser would not be able to return to this state.

Recent reduction produced nonterminal NT which must be processed. Automaton performs transition under this nonterminal right away. The action in state #3 for NT is push that nonterminal on stack and go to state #4. After that configuration is:

State #4
stack: #1 a #3 NT #4
pending input: d $end
Current position in grammar:
STMT: a NT ^ d;

Next input symbol is d. Action for symbol d is to shift it and go to state #7.

State #7
stack: #1 a #3 NT #4 d #7 input: $end
Current position in grammar:
STMT: a NT d ^;

In state #7 the action is to reduce by production @1 to nonterminal STMT providing input stream has come to an end. Reduction first removes top of stack elements corresponding to the production:

State #1
stack: #1
pending nonterminal: STMT
pending input: $end
Current position in grammar:
^ STMT

Action for STMT in state #1 is to go to state #2:

State #2
stack: #1 STMT #2
pending input: $end
Current position in grammar:
STMT ^

Top level nonterminal of grammar STMT was successfully processed, and input stream reached the end. Parsing ends with success.

< Automaton | Handling alternative productions >
Alpag Manual