Recursion

Recursion occurs when grammar production for a nonterminal references this nonterminal also on right-hand-side, either directly, or via other intermediate productions.

Context Free Grammar productions have fixed number of right-hand-side elements. To describe variable length sequence of elements recursion must be used.

If consecutive elements of sequence are to appear with no separators between them (should 'touch' each other) then recursive nonterminal must be placed either on very start or very end of recursive production. Such construction is called left- or right- recursion respectively.

This chapter describes parser operation with left and right recursion employed to process a sequence of elements. Language considered is a sequence of symbols a.

Left recursive grammar matching this language is:

@1 L: L a
@2 L: a;

Right recursive grammar for the same language is:

@1 R: a R;
@2 R: a;

Example expansion of both these grammars is shown on Fig. 14.

Fig. 14 Left and right recursion

Left recursion

Automaton for left recursive grammar is:

state #1
$accept: . L
actions
a shift to #3
L goto #2
state #2
$accept: L .
L: L . a
actions
a shift to #4
$end reduce by @3 to $accept
state #3
L: a .
actions
$end reduce by @2 to L
a reduce by @2 to L
state #4
L: L a .
actions
$end reduce by @1 to L
a reduce by @1 to L

Input sequence being parsed is a sequence of three symbols a:

a a a.

Since grammar is left recursive, this input sequence will be parsed as:

@1 L: ( @1 L: ( @2 L: a ) a ) a

To better illustrate parsing process each nonterminal stored on stack is indexed (L1, L2…) and is shown together with its source elements. Note that in reality such information is not present on stack.

Initial state:

State #1

stack: #1

pending input: a a a $end

Current position in grammar:

^ L

Action for next input symbol a is to shift and go to state is #3:

State #3

stack: #1 a #3

pending input: a a $end

Current position in grammar:

@2 L: a ^;

In state #3 action is to reduce by production @2 if next input token is a:

State #2

stack: #1 L1:a #2

pending input: a a $end

Current position in grammar:

@1 L: L ^ a

In state #2 shift next input symbol a and go to state #4.

State #4

stack: #1 L1:a #2 a #4

pending input: a $end

@2 L: a ^;

In state #4 action is to reduce by production @1:

State #2

stack: #1 L2:(L1:a) a #2

pending input: a $end

@1 L: L ^ a

Shift last a.

State #4

stack: #1 L2:(L1:a) a #2 a #4

pending input: $end

Current position in grammar:

@1 L: L ^ a

and reduce by production @1:

State #2

stack: #1 L3:(L2:(L1:a) a) a #2

pending input: $end

Current position in grammar:

@1 L: L ^ a

Since end of input was reached final state is accepting state. Parsing ends

Right recursion

Automaton for right recursive grammar is:

state #1
$accept: . R
actions
a shift to #3
R goto #2
state #2
$accept: R .
actions
$end reduce by @3 to $accept
state #3
R: a . R
R: a .
actions
a shift to #3
$end reduce by @2 to R
R goto #4
state #4
R: a R .
actions
$end reduce by @1 to R

Input sequence being parsed is same as in previous example:

a a a.

Grammar is right recursive, so input sequence will be parsed as:

@1 R: a (@1 R: a (@2 R: a) )

Initial state:

State #1

stack: #1

pending input: a a a $end

Current position in grammar:

^ R

First action is to shift a and go to state #3:

State #3

stack: #1 a #3

pending input: a a $end

Current position in grammar:

R: a ^ R;

R: a ^;

In state #3 action is to shift another a and return to the same state:

State #3

stack: #1 a #3 a #3

pending input: a $end

Current position in grammar:

R: a ^ R;

R: a ^;

For third a action is the same

State #3

stack: #1 a #3 a #3 a #3

pending input: $end

Current position in grammar:

R: a ^ R;

R: a ^;

Since no more input remains next action is to reduce by production @2 to R:

State #4

stack: #1 a #3 a #3 R:a #4

pending input: $end

Current position in grammar:

R: a R ^;

In state #4 next action is again to reduce:

State #4

stack: #1 a #3 R:a (R:a) #4

pending input: $end

Current position in grammar:

R: a R ^;

And once again:

State #2

stack: #1 R:a (R:a (R:a)) #2

pending input: $end

Current position in grammar:

R: a R ^;

Parser is accepting state. Parsing ends

Conclusions

LR(1) parsers work in bottom-up manner processing input from left to right.

When grammar is left recursive bottom-level productions appear first in input stream and can be completely reduced even before upper-level productions. As a result stack does not grow regardless of the depth of recursion.

When grammar is right recursive all productions are aligned with far end of input sequence. Until this end is reached, no production can be reduced. Input terminals being elements of these productions must be preserved which causes the stack to grow. Once end of input has been reached a chain of reductions occurs, collecting elements from stack, and wrapping them in nonterminals in bottom-up order. This continues until all elements on stack has been reduced to a single topmost nonterminal.

In either scenario reductions are fired in bottom-up order. In left-recursive variant reductions for low level productions are executed even before further input terminals has been read. In right-recursive scenario reductions are fired after all terminals has been successfully loaded.

Potential unlimited stack growth possible with right recursion should be avoided for performance and security reasons. In scenarios where either left or right recursion can be used, left should be preferred.

< Handling alternative productions | Lookaheads and defaulting >

Alpag Manual