Parse::Eyapp
facilitates the construction of
abstract syntax trees (AST) through the %tree
directive.
Nodes in the AST are blessed in the production
name
.
A rhs can be
named using the %name IDENTIFIER
directive.
For each rhs name a
class/package with name IDENTIFIER
is created.
Symbolic tokens (like NUM
PRINT
or VAR
)
are considered by default semantic tokens.
String literals
(like '+'
, '/'
, etc.)
are - unless explictly
declared using the semantic token directive -
considered syntactic tokens.
When building the AST syntactic tokens do not yield
new nodes.
Semantic tokens however have their own. Thus
when feed with input b=2*a
the generated parser
produces the following AST2:
EXPS( ASSIGN( VAR(TERMINAL[b]), TIMES( NUM(TERMINAL[2]), VAR(TERMINAL[a])) ) )Nodes of the AST are hashes that can be decorated with new keys/attributes. The only reserved field is
children
which is a reference to the
array of children.
Nodes named TERMINAL
are built from the
tokens provided by the lexical analyzer.
The couple ($token, $attribute)
returned by the lexical analyzer
is stored under the keys token
and attr
.
TERMINAL
nodes also have the attribute children
which is
set to an anonymous empty list.
Observe the absence of TERMINAL
nodes corresponding to
tokens '='
and '*'
.
If we change the status of '*'
and '='
to semantic using the %semantic token
directive:
1 %semantic token '*' '=' 2 %right '=' 3 .... etc.we get a - concrete - syntax tree:
EXPS( ASSIGN( VAR(TERMINAL[b]), TERMINAL[=], TIMES( NUM(TERMINAL[2]), TERMINAL[*], VAR(TERMINAL[a]) ) # TIMES ) # ASSIGN )Let us now consider the input
2*(a+1)
.
The parser yields the tree:
EXPS( TIMES( NUM( TERMINAL[2]), exp_14( PLUS( VAR(TERMINAL[a]), NUM(TERMINAL[1])) ) # PLUS ) # TIMES )Two features are noticeable: the parenthesis rule
exp:
'(' exp ')'
had no name
and got automatically one: exp_14
. The name of a rhs by
default results from concatenating the left hand side of the rule
with the ordinal number of the rule3.
The second is that node exp_14
is useless and can be suppressed.
The %tree
directive can be accompanied of the %bypass
clause. A %tree bypass
produces an automatic bypass of any
node with only one child at tree-construction-time.
A bypass operation consists in returning the only child
of the node being visited to the father of the node and re-typing (re-blessing)
the node in the name of the production4.
Changing the line %tree
by %tree bypass
in file Infix.eyp
we get a more suitable
AST for input 2*(a+1)
:
EXPS(TIMES(NUM[2],PLUS(VAR[a],NUM[1])))
The node exp_14
has disapeared in this version
since the bypass operation applies to the rhs
of the rule exp: '(' exp ')'
:
Tokens '('
and ')'
are syntactic tokens
and therefore at tree construction time
only one child is left. Observe also the absence
of TERMINAL
nodes. Bypass clearly applies
to rules exp: NUM
and exp: VAR
since
they have only one element on their rhs. Therefore the
TERMINAL
node is re-blessed as NUM
and
VAR
respectively.
A consequence of the global scope application of %tree bypass
is that undesired bypasses may occur. Consider the tree
rendered for input -a*2
:
EXPS(TIMES(NEG,NUM))
What happened? The bypass is applied to the rhs
'-' exp
. Though the rhs has two symbols, token '-'
is
a syntactic token and at tree-construction-time
only exp
is left. The bypass
operation applies when building this node.
This undesired bypass can be avoided applying
the no bypass
directive to the
production:
exp : %no bypass NEG '-' exp %prec NEGNow the AST for
-a*2
is correct:
EXPS(TIMES(NEG(VAR),NUM))
Eyapp provides operators +
, *
and ?
for the creation of lists and optionals as in:
line: sts <EXPS + ';'>which states that a line is made of a non empty list of
EXPS
separated by semicolons.
By default the class name for such list is _PLUS_LIST
.
The %name
directive can be used to modify
the default name:
line: sts <%name EXPS + ';'>
Explicit actions can be specified by the programmer.
They are managed as anonymous subroutines
that receive as arguments the attributes of the symbols
in the rule and are executed each time a reduction
by that rule occurs. When running under the %tree
directive
this provides a mechanism to influence the shape of the AST.
Observe however that the grammar in the example is clean of actions:
Parse::Eyapp allowed us to produce a suitable AST without writing
any explicit actions.
Procesadores de Lenguaje 2007-03-01