Ambiguities and Conflicts

The preceding grammar is ambiguous. For instance, an expression like exp '-' exp followed by another minus '-' can be parsed in more than one way. The activity of an LALR(1) parser (the family of parsers to which Eyapp belongs) consists of a sequence of shift and reduce actions. A shift action reads the next token. A reduce action finds a production rule whose right hand side (rhs) matches the most recently read symbols and replaces that rhs with the left hand side (lhs) of the production. For the input NUM - NUM - NUM the activity is as follows (the dot marks the position of the next input token):

.NUM - NUM - NUM # shift
 NUM.- NUM - NUM # reduce exp: NUM 
 exp.- NUM - NUM # shift
 exp -.NUM - NUM # shift
 exp - NUM.- NUM # reduce exp: NUM
 exp - exp.- NUM # shift/reduce conflict

At this point two different decisions can be taken: the next configuration can be

 exp.- NUM # reduce by exp: exp '-' exp

or:

 exp - exp -.NUM # shift '-'

This is called a shift-reduce conflict: the parser must decide whether to shift '-' or to reduce by the rule exp: exp '-' exp.
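The two outcomes of the conflict can be sketched with a toy bottom-up parser. This is an illustration in Python, not Eyapp code and not how Eyapp builds its tables: the function parse, its on_conflict parameter and the ("exp", text) stack representation are all hypothetical names introduced here. Integers stand for NUM tokens.

```python
def is_exp(item):
    # An exp on the stack is a pair ("exp", parenthesized_text).
    return isinstance(item, tuple) and item[0] == "exp"

def parse(tokens, on_conflict):
    """Shift-reduce parse of NUM ('-' NUM)*.

    on_conflict is "shift" or "reduce" and is consulted only when the
    stack ends in exp '-' exp and the lookahead is another '-'.
    """
    stack, pos = [], 0
    while True:
        lookahead = tokens[pos] if pos < len(tokens) else None
        if stack and isinstance(stack[-1], int):
            # reduce by exp: NUM
            stack[-1] = ("exp", str(stack[-1]))
        elif (len(stack) >= 3 and is_exp(stack[-1]) and stack[-2] == "-"
              and is_exp(stack[-3])
              and (lookahead is None or on_conflict == "reduce")):
            # reduce by exp: exp '-' exp
            right, _, left = stack.pop(), stack.pop(), stack.pop()
            stack.append(("exp", f"({left[1]}-{right[1]})"))
        elif lookahead is not None:
            stack.append(lookahead)  # shift
            pos += 1
        else:
            break
    assert len(stack) == 1 and is_exp(stack[0])  # input fully reduced
    return stack[0][1]

print(parse([4, "-", 5, "-", 9], "reduce"))  # ((4-5)-9)
print(parse([4, "-", 5, "-", 9], "shift"))   # (4-(5-9))
```

Choosing reduce at the conflict groups to the left, while choosing shift groups to the right.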

Resolving such conflicts is the reason for the precedence declarations in the head section. Another kind of conflict is the reduce-reduce conflict, which arises when more than one rhs can be applied for a reduction action.

By associating priorities with tokens the programmer can tell Eyapp what syntax tree to build in case of conflict.

The declarations %nonassoc, %left and %right declare the tokens that follow them and associate a priority with them. Tokens declared in the same line have the same precedence. Tokens declared in later lines have higher precedence than those declared in earlier lines. Thus, in the example we are saying that '+' and '-' have the same precedence, which is higher than that of '='. The effect of '-' having higher precedence than '=' is that an expression like a=4-5 is interpreted as a=(4-5) and not as (a=4)-5. Applying %left to '-' indicates that, when there is an ambiguity and the precedences are equal, the parser must build the tree corresponding to a left parenthesization. Thus, 4-5-9 is interpreted as (4-5)-9.
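The resolution rule described above can be sketched in a few lines of Python. This assumes the usual yacc convention, in which the precedence of the rule being reduced is compared with that of the lookahead token. The table DECLARATIONS and the functions precedence_table and resolve are hypothetical names for this illustration, and declaring '=' as right associative is an assumption, since the head section is not reproduced here.

```python
# Declarations in order of increasing precedence: (associativity, tokens).
# Assumption: '=' is right associative; '+' and '-' share one %left line.
DECLARATIONS = [
    ("right", ["="]),
    ("left", ["-", "+"]),
]

def precedence_table(declarations):
    """Map each token to (level, associativity); later lines rank higher."""
    return {tok: (level, assoc)
            for level, (assoc, toks) in enumerate(declarations, start=1)
            for tok in toks}

def resolve(rule_token, lookahead, table):
    """Decide a shift-reduce conflict.

    rule_token is the token that gives the rule its precedence,
    lookahead is the next input token.
    """
    rule_level, _ = table[rule_token]
    look_level, assoc = table[lookahead]
    if rule_level > look_level:
        return "reduce"
    if rule_level < look_level:
        return "shift"
    # Equal precedence: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc]

table = precedence_table(DECLARATIONS)
print(resolve("-", "-", table))  # reduce: 4-5-9 becomes (4-5)-9
print(resolve("=", "-", table))  # shift: a=4-5 becomes a=(4-5)
```

With a %nonassoc token at equal precedence the sketch reports an error, which matches the idea that such operators cannot be chained.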

The %prec directive can be used when a rhs involved in a conflict contains no tokens, or when it contains tokens but the precedence of the last one leads to an incorrect interpretation. A rhs can be followed by an optional %prec token directive, which gives the production the precedence of that token:

exp:   '-' exp %prec NEG { -$_[2] }

This resolves the conflict in - NUM - NUM between (- NUM) - NUM and - (NUM - NUM). Since NEG has higher precedence than '-', the first interpretation wins.
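As a quick arithmetic check of why the choice matters, the two groupings of - NUM - NUM give different values. The numbers 2 and 3 are arbitrary stand-ins for the NUM tokens, and the snippet is plain Python rather than Eyapp.

```python
# Two ways to group "- 2 - 3"; 2 and 3 are arbitrary sample values.
unary_first = (-2) - 3   # reduce by exp: '-' exp before the binary '-'
binary_first = -(2 - 3)  # shift the binary '-' and reduce it first

print(unary_first, binary_first)  # prints: -5 1
```

Only the first grouping agrees with the usual reading of unary minus.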

Procesadores de Lenguaje 2007-03-01