Tree Transformations

Once we have the AST we can transform it using the Treeregexp language. The code below (in file I2PIR.trg) shows a set of algebraic tree transformations whose goal is to produce machine independent optimizations.

{ #  Example of support code
  use List::Util qw(reduce);
  my %Op = (PLUS=>'+', MINUS => '-',
            TIMES=>'*', DIV => '/');
}
algebra = fold wxz zxw neg;

fold: /TIMES|PLUS|DIV|MINUS/:b(NUM, NUM)
=> {
  my $op = $Op{ref($b)};
  $NUM[0]->{attr} = eval
  "$NUM[0]->{attr} $op $NUM[1]->{attr}";
  $_[0] = $NUM[0];
}
zxw: TIMES(NUM, .) and {$NUM->{attr}==0}
=> { $_[0] = $NUM }
wxz: TIMES(., NUM) and {$NUM->{attr}==0}
=> { $_[0] = $NUM }
neg: NEG(NUM)
=> { $NUM->{attr} = -$NUM->{attr};
     $_[0] = $NUM }

A Treeregexp programs is made of treeregexp rules that describe what subtrees match and how transform them:

wxz: TIMES(., NUM) and {$NUM->{attr}==0}
=> { $_[0] = $NUM }

A rule has a name (wxz in the example), a term describing the shape of the subtree to match "TIMES(., NUM)" and two optional fields: a semantic condition expliciting the attribute constraints (the code after the reserved word and) and some transformation code that tells how to modify the subtree (the code after the big arrow =>). Each rule is translated into a subroutine ⁵with name the treerexexp rule name. Therefore, after compilation a subroutine wxz will be available. The dot in the term TIMES(., NUM) matches any tree. The semantic condition states that the attr entry of node NUM must be zero. The transformation code - that will be applied only if the matching succeeded - substitutes the whole subtree by its right child.

References to the nodes associated with some CLASS appearing in the term section can be accessed inside the semantic parts through the lexical variable $CLASS. If there is more than one node the associated variable is @CLASS. Variable $_[0] refers to the root of the subtree that matched.

Nodes inside a term can be described using linear regular expressions like in the fold transformation:

/TIMES|PLUS|DIV|MINUS/:b(NUM, NUM)

In such cases an optional identifier to later refer the node that matched can be specified (b in the example).

Tree transformations can be grouped in families:

algebra = fold wxz zxw neg;

Such families - and the objects they collect - are available inside the client program (read anew the code of the driver in section 2). Thus, if $t holds the AST resulting from the parsing phase, we can call its method s (for substitute) with args the @algebra family:

$t->s(our @algebra);

The s method of Parse::Eyapp::Node⁶proceeds to apply all the transformation in the family @algebra to tree $t until none of them matches. Thus, for input a = 2*(a+b)*(2-4/2) the parser produces the tree:

EXPS(
  ASSIGN(
    VAR[a],
    TIMES(
      TIMES(NUM[2],PLUS(VAR[a],VAR[b])),
      MINUS(NUM[2],DIV(NUM[4],NUM[2])
    )
  )
)

which is transformed by the call $t->s(@algebra) in:

EXPS(ASSIGN(VAR[a],NUM[0]))

Procesadores de Lenguaje 2007-03-01