alpag.net manual
Parser / Conflicts / Resolving conflicts
< Conflict context | Managing conflicts >

Resolving conflicts

Usually when a new grammar is defined and compiled for the first time, multiple conflicts can appear.

A single conflict is reported for:

Conflicts are not necessarily unrelated. Often multiple reported conflicts are caused by single issue in the grammar which causes ambiguities in multiple states.

Although conflicts are reported for automaton states, user must solve them making corrections to the source grammar. It is thus important to understand the relationship between grammar and automaton built from this grammar.

When a conflict occurs user can:

There are several types of tweaks or hints that can be used to resolve conflicts. These include:

Once conflict is manually resolved, it is not reported any more during parser generation. Sometimes the way conflict was resolved may be inappropriate, so conflict warning is silenced but the problem remains. It may be thus necessary to inspect conflicts again, including resolved ones.

All conflicts, resolved or not, are always listed in parser report.

Global settings

Several options control conflict resolution on global level. These options have usually three possible settings:

Global options are:

Parser.Conflicts.SR_ResolveAsShift

When enabled, all shift-reduce are resolved as shift by default. If particular conflict has any token-specific or production-specific conflict resolution settings, then these settings take precedence over this global option (so this option applies only to conflicts not resolved by other means).

Note that some LR parser generators by default assume that s-r conflicts should be resolved as shift. Alpag does not make such assumption by default. Above option has to be explicitly enabled to get such behavior.

Parser.Conflicts.RR_ResolveByRuleOrder

When enabled, all reduce-reduce conflicts are resolved by order of rules in grammar (rule defined earlier wins).

Note that this option is very dangerous. It should not be used in most cases. It was provided solely for the sake of completeness.

Parser.Conflicts.SR_ResolveByPrecedence

This option enables terminal and production precedence rules. It is on by default. Switching it to off disables all precedence declarations anywhere in grammar.

Set this option to YesWarn to resolve conflicts but keep generating warnings.

Parser.Conflicts.SR_ResolveByAssoc

This option enables resolving conflicts using terminal symbol associativity. It is on by default. When this option is off, terminal symbol associativity declarations like %left or %right are not used to resolve conflicts.

Disabling this option has no effect on %nonassoc declarations, which can be used to implicitly generate a conflict.

Set this option to YesWarn to resolve conflicts but keep generating warnings.

Parser.Conflicts.RR_ResolveIdenticalByOrderInRule

This option is used to resolve reduce-reduce conflicts between identical productions that differ only by their lookahead context. When set, production defined earlier in file wins.

Normally it is illegal to define grammar with multiple identical productions. It can be done in Alpag providing that their lookahead contexts are different. If defined lookahead contexts overlap, a reduce-reduce conflict is generated. It can be resolved using rule order when above option is on. Note that using nonoverlapping contexts is better though.

Parser.Conflicts.RR_ResolveIndenticalPreferLookahead

This option is used in situations, when multiple identical productions are defined with different lookahead context plus one additional production with no lookahead. When option is on, productions with explicit lookahead contexts are preferred over 'naked' production. Production without lookahead context applies only to those terminal symbols that are not explicitly listed in any lookahead list.

Precedence

The idea of using precedence to resolve shift-reduce conflicts has its roots in math, where multiplication operator '*' has higher precedence that addition '+' so expression like 2+3*5 is interpreted as 2+(3*5).

Grammar outline

Concept of precedence is associated with terminal symbols. It can be used to compare them for priority. The general scenario where precedence mechanism applies is as follows:

@1 ... A SOME_OPTIONAL_NTS_ALLOWED ^ // ready to reduce
@2 ... A SOME_OPTIONAL_NTS_ALLOWED ^ B // must shift B

In certain state parser must choose between reducing by production @1 and shifting B to go for production @2.

Assuming that terminals A and B have different precedences parser will:

Note that comparison is made between:

There can be some nonterminals in either production after last terminal A, here SOME_OPTIONAL_NTS_ALLOWED.

Example

Assuming that terminal symbols are * and + (with * having higher precedence than +) presented mechanism behaves as follows:

expr: mul | add | ...
@1 mul: expr * expr
@2 add: expr + expr

Stack situation:

expr * expr ^ + // e.g. 2*3 +

Elements on stack can be reduced by production @1 (multiplication). Alternatively + can be shifted to go for production @2 (addition). By using precedence rules multiplication wins and reduction occurs:

mul ^ + // (2*3) +

assuming that next comes another expression further parser actions are:

mul + expr ^ // (2*3) + 4, reduce by production @2
add ^ // ((2*3)+4)

Alternative situation is:

expr + expr ^ * // e.g. 5+6*

Elements on stack can be reduced by production @2 (addition). Alternatively * can be shifted to go for production @1 (multiplication). Using precedence, multiplication wins and shift occurs:

expr + expr * ^ // still 5+6*

assuming that next comes another expression, further parser actions are:

expr + expr * expr ^ // 5+6*7, reduce by production @1
expr + mul ^ // 5+ (6*7), reduce by production @2
add ^ // (5+(6*7))

Note that precedence is a permanent property of a terminal symbol. A terminal with higher precedence wins s-r conflicts over lower-precedence terminals everywhere in grammar.

Declaring precedence

Precedence is not assigned to terminal symbols directly. Instead terminal symbols are declared as members of a particular precedence class. Precedence classes are part of global order. Classes defined further in the file have higher precedence.

Precedence classes can be declared as:

It is possible to change precedence of single grammar production using %prec keyword as follows:

lhsNt: elt1 elt2 ... %prec TERM

If above production appears in any s-r conflict as candidate for reduction then Alpag will not use precedence of last terminal of this production. Instead precedence of TERM terminal will be used. The TERM can by any terminal defined in the grammar.

Note that the precedence of TERM terminal will be used when resolving all s-r conflicts with this production everywhere in the grammar.

Mechanism of precedence has higher priority that mechanism of associativity.

Associativity

Associativity is used to resolve shift-reduce conflicts when both reduction and shift are done in context of the same terminal or terminals in the same precedence class. A typical scenario for this type of conflict is a sequence of binary operations like 2+3+4. This can be interpreted as either (2+3)+4 or 2+(3+4). In conventional math, where addition is associative, result of both above expressions is the same. When construction like this appears in parser grammar it must be explicitly resolved as either left or right associative.

Grammar outline

When grammar contains one or more terminals with the same precedence that can appear in chains, shift-reduce conflict can appear. Example grammar with such problems is:

expr: expr OP expr; // this production alone causes the problem

Given the input sequence as follows:

expr OP expr OP expr

A problem arises when parser reaches state:

expr OP expr ^ OP

At this point parser can either reduce or shift OP.

Without associativity above shift-reduce conflict would be resolved using default configured s-r behavior.

If terminal OP is declared as left-associative by using %left keyword a reduction will be chosen. Entire scenario is:

expr OP expr ^ OP // e.g. 2+3, pending '+'; do not shift OP '+', reduce
expr ^ OP // (2+3) pending '+'; now shift OP '+'
expr OP ^ // (2+3)+, fetch next token …
expr OP expr ^ // (2+3)+4, reduce
expr ^ // ((2+3)+4)

If terminal is declared right-associative using %right keyword then shift will be shosen for each following OP. Steps are:

expr OP expr ^ OP // e.g. 2+3, pending '+'; shift OP '+'
expr OP expr OP ^ // 2+3+, waiting for next token …
expr OP expr OP expr ^ // 2+3+4, assuming there are no more OP's, reduce
expr OP expr ^ // 2+(3+4), reduce again
expr ^ // (2+(3+4))
Different terminals

Associativity mechanism works not just for a single terminal. It works for all terminals with the same associativity providing these have the same precedence. If two terminals PLUS and MINUS have the same precedence and associativity and the grammar is:

expr: expr PLUS expr;
expr: expr MINUS expr;

Then associativity will work for mixed sequences of these terminals as well:

expr PLUS expr MINUS expr // 2+3-4 becomes (2+3)-4 with %left and 2+(3-4) with right
expr MINUS expr PLUS expr // 5-6+7 becomes (5-6)+7 with %left and 5-(6+7) with right

Note that keywords %left and %right not only declare terminals with associativity. These also introduce new anonymous precedence class. It means that all terminals declared with these keywords on a single row will belong to the same precedence class.

Consider declaration:

%left PLUS "+" MINUS "-" // precedence class with two members only

Here PLUS and MINUS are the only terminals in their precedence class. Associativity mechanism is used only for terminals with the same precedence so it will work only for combinations of PLUS and MINUS. Shift-reduce conflicts with other terminals will be resolved using other rules, not associativity.

If terminals are declared like this:

%left PLUS "+" // lower precedence class with single member
%left MINUS "-" // higher precedence class with single member

associativity will not apply with mixed combinations of these terminals. For example input like 2+3-4 will be recognized as 2+(3-4) since MINUS has higher precedence than PLUS.

Behavior described above may seem a bit confusing. You may consider using a named precedence class to resolve all ambiguities like so:

%prclass ADDSUB
%left !ADDSUB PLUS "+"
%left !ADDSUB MINUS "-"
Nonassoc

Terminal symbols can be declared using %nonassoc keyword. Using this declaration explicitly states that terminal is not allowed to appear in any context where associativity is important. Shift-reduce conflicts with such terminal will generate an error.

The %nonassoc is to prevent user from accidentally writing a grammar where terminal appears in associative-like manner and is accepted in some way (shifted or reduced) due to some default settings.

Performance considerations

Parsers generated by Alpag work better when left recursion is used since it does not cause parser stack to grow. If choosing between right or left recursion is not relevant for the grammar, left recursion should be used.

Note that examples in this chapter suggest %left for PLUS and MINUS operators. In typical algebras addition and subtraction are associative, so either %left or %right could be used for them. That's why option giving best performance was chosen.

< Conflict context | Managing conflicts >
Alpag Manual