Automaton
Parser operation is based on automaton, which contains states corresponding to different places in grammar. Transitions between states are triggered by input terminal symbols.
Automaton is using additional data structure, a stack, which keeps trace of most recently processed symbols and visited states. When parser recognizes several consecutive input symbols as a production for some nonterminal, it removes them from stack and sets automaton to state corresponding to end of this nonterminal in some high-level grammar construct.
Parser during its operation performs two kinds of actions:
- shift, performed when terminal symbol is read from input and pushed to stack. Since this operation essentially shifts symbol from input stream to top of stack it is called 'shift'.
- reduce , executed after top-of-stack elements were recognized as valid production. This action replaces these elements with corresponding nonterminal. Action has two phases: first elements are removed from stack and a state from before shifting these elements is restored. Immediately after that automaton performs transition (goto) under newly recognized nonterminal
Each automaton state has an action table which tells what to do depending on next symbol read. Possible entries in that table are:
- shift, encoded as 'if next input symbol is T shift it to stack and go to next state S'
- reduce, encoded as: 'if next input symbol is T, leave it on input and perform reduction according to production P'.
Known length of production P tells how many elements should be removes from stack. After elements are removed, state found on top of stack becomes current state.
goto, encoded as 'goto next state S for nonterminal N' is performed immediately after reduction for a nonterminal being result of this reduction.
Examples given in further chapters illustrate basic parser operation scenarios.