Using lookahead symbols to resolve ambiguity is the essence of LR(1) algorithm (the '1' stands for one lookahead symbol). There are however situations when analyzing lookahead is not necessary.
Reduction does not require reading next input symbol since it modifies elements already on stack. If a state contains only a single possible action, reading next symbol is not necessary to choose between available actions. That's why states with single reduction can be processed in a simplified way, without fetching lookahead symbol.
Replacing multiple actions with a single default action is called 'defaulting'. A state that was simplified this way is called a defaulted state. Defaulting is a method of decreasing parser table sizes and is done for optimization reasons only.
If input data is valid, parser behavior is the same whether defaulting is used or not. However if input data contains errors, parser behavior can be different when defaulting is used.
This problem will be illustrated using following grammar:
Below is an automaton for the grammar. The automaton was built without defaulting (all states have full lookahead tables). Only states relevant to analyzed scenario are shown:
In above grammar state #5 corresponds to end of production WHO_WHAT:
WHO_WHAT: Mary had ^
State #5 has just one action defined, a reduction by production @2. It is a potential candidate for defaulting.
When not configured explicitly LR(1) parser generated by Alpag does not use defaulting. All states have action tables with explicit entries for various lookahead symbols
Action table for state #5 contains two entries:
During parser construction list of all terminals that can possibly appear next is calculated for each state. In state #5 after 'Mary had' next possible terminal symbols are lamb or goat. That's why action table for state #5 contains two explicit entries for these symbols.
When parser reaches state #5 a lookahead symbol is fetched and compared against action table. If lookahead symbol is lamb or goat reduction by production 2 is done. For other symbols an error is reported.
For example if input sequence is 'Mary had Mary' then automaton upon reaching state #5 fetches 'Mary' as lookahead symbol. Since this symbol has no action defined in state #5 parsing fails.
State #5 contains only one action and can be defaulted.
When defaulting is on, definition of state #5 changes as follows:
A single catch-all action is defined for all possible terminals. When parser reaches state #5 it does not fetch a lookahead symbol. It simply performs a reduction regardless what next symbol is.
Assuming that input text is 'Mary had Mary' parser steps are:
Do not fetch lookahead. Use default rule and perform reduction by production @2 to WHO_WHAT:
State #3 has multiple actions for different terminals, so lookahead symbol is necessary. Read lookahead symbol:
There is no action for terminal Mary in state #3. Parsing fails
If input contains no errors (conforms to grammar) then parser behavior is the same with and without defaulting. When input contains an error, exact moment this error is detected may be different with defaulting on or off.
Consider following grammar which matches input sequence 'a b c d':
If input sequence is 'a b c z' and current parser state is immediately after 'c' like this:
then:
Alpag provides reduction preview mechanism which enables both defaulting and early error detection.
When option Parser.Reduction.PreviewCheck is on, parser performs simulated reductions before performing actual reductions.
When parser is in defaulted state it cannot verify next input symbol. It performs simulated reduction and gets to the after-reduction state. If this state is also defaulted parser performs the procedure again, until non-defaulted state is found. Then it verifies next input symbol against action table of this non-defaulted state. If result of verification is positive parser gets back to starting point and performs reductions once again, this time 'for real'. If result is negative parser gets back to starting point and reports error there. Error is reported as if defaulting was not used.
Reduction preview mechanism introduces processing overhead though is not prohibitively costly. Note that mechanism is activated only in defaulted states. For non-defaulted states parser works as usual.