Department of Computing, Macquarie University
Obr Source Program Tree Definition
All of the information that the structuring task of the compiler
obtains about the source program is embodied in the source
program tree. If a particular item of information cannot be
accessed via this tree, then it cannot be obtained at
all. Information is encoded in the ``shape'' of the tree and in
the values stored at the leaves. This section defines the set of
possible source program trees by defining all of the concepts and
constructs of the Obr source language and how they are
represented.
Obr Concepts
Obr is a rather simple language, having only a small number of
distinct concepts. Each of the concepts is described below in a
separate subsection. The subsections are ordered according to the
order of appearance of the concepts in the Obr definition.
Identifier
An identifier is a freely chosen representation for an object. Its
properties are a string representation and the associated entity
determined by the scoping rules. The associated entity is determined
during semantic analysis; it is not stored at the time the tree is
constructed.
Each identifier occurrence in the source program is represented in the
source program tree by a leaf of type IdnLeaf. A unique encoding of the
identifier's string representation is stored at that leaf when the
source program tree is constructed.
Integer
An integer denotation is a representation of an integer
value. Its only property is the value represented.
According to the definition of Obr, any sign preceding a denotation is
not part of that denotation; therefore, only nonnegative values are
represented by integer denotations.
Each integer denotation in the source program is represented in the
source program tree by a leaf of type IntLeaf. The value
of the integer represented is stored at that leaf when the source
program tree is constructed.
Boolean
A Boolean denotation is a representation of a Boolean value. Since
there are only two possible values, each value is represented by a
different kind of leaf in the source program tree, either
FalseLeaf or TrueLeaf.
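The leaf concepts above can be sketched as follows. This is only an illustration in Python, not the representation used by the compiler: the field names `name` and `value` and the helper `bool_leaf` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdnLeaf:
    name: str  # string representation; the entity is found during semantic analysis


@dataclass(frozen=True)
class IntLeaf:
    value: int  # a preceding sign is not part of the denotation

    def __post_init__(self):
        if self.value < 0:
            raise ValueError("integer denotations are nonnegative")


@dataclass(frozen=True)
class FalseLeaf:
    pass


@dataclass(frozen=True)
class TrueLeaf:
    pass


def bool_leaf(value):
    """Hypothetical helper: pick the leaf kind for a Boolean value."""
    return TrueLeaf() if value else FalseLeaf()
```

Note that the two Boolean leaves carry no stored value; the kind of leaf is itself the information.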
Source
A source is a complete Obr program. Its properties are the scope that
it provides for parameter and variable declarations and the type of its
result.
Both properties are evaluated during semantic analysis; neither is
stored at the time the tree is constructed.
A source node is the root of the source program tree, and never
appears in any other position.
Declaration
A declaration is a construct that associates an identifier with an
entity. Its only property is the identifier/entity relationship it
establishes.
The identifier/entity relationship is established during semantic
analysis; it is not stored at the time the tree is constructed.
Statement
A statement is a construct that carries out an action but does not
return a value. It has no additional properties.
Expression
An expression is a construct that returns a value. Its only property
is the type of the value it returns.
The type returned by an expression is determined during semantic
analysis; it is not stored at the time the tree is constructed.
Obr Constructs
The following table summarises the constructs in the abstract syntax
of Obr. Each construct is described by a labelled context-free
grammar rule. For example, the "IntConst" construct appears as a
subtree whose root is a Declaration node. The Declaration node has
two children: an Identifier node and an Expression node.
AndExp: Expression : Expression Expression
ArrayVar: Declaration : Identifier Integer
AssignStmt: Statement : Expression Expression
BoolExp: Expression : Boolean
BoolVar: Declaration : Identifier
EqualExp: Expression : Expression Expression
ExitStmt: Statement
ForStmt: Statement : Identifier Expression Expression Statement*
GreaterExp: Expression : Expression Expression
IdnExp: Expression : Identifier
IdnLeaf: Identifier
IfStmt: Statement : Expression Statement* Statement*
IndexExp: Expression : Expression Expression
IntConst: Declaration : Identifier Expression
IntParam: Declaration : Identifier
IntExp: Expression : Integer
IntVar: Declaration : Identifier
LessExp: Expression : Expression Expression
LoopStmt: Statement : Statement*
MinusExp: Expression : Expression Expression
ModExp: Expression : Expression Expression
NegExp: Expression : Expression
NotEqualExp: Expression : Expression Expression
NotExp: Expression : Expression
ObrInt: Source : Declaration+ Statement*
OrExp: Expression : Expression Expression
PlusExp: Expression : Expression Expression
ReturnStmt: Statement : Expression
SlashExp: Expression : Expression Expression
StarExp: Expression : Expression Expression
WhileStmt: Statement : Expression Statement*
Most of the constructs have a fixed number of components. Variable
numbers of components are indicated in the productions by the
notation "X*", specifying "zero or more" Xs, and "X+", specifying
"one or more" Xs. Thus an ObrInt has one or more Declaration
components and zero or more Statement components.
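The reading of the productions can be sketched as a small table plus a component-count check. This is a Python illustration only: the dictionary `PRODUCTIONS`, the multiplicity markers "1", "*" and "+", and the function `arity_ok` are assumptions made for this sketch, and only four of the productions are copied in.

```python
# construct -> (kind of the root node, [(kind of component, multiplicity)])
PRODUCTIONS = {
    "IntConst": ("Declaration", [("Identifier", "1"), ("Expression", "1")]),
    "WhileStmt": ("Statement", [("Expression", "1"), ("Statement", "*")]),
    "ObrInt": ("Source", [("Declaration", "+"), ("Statement", "*")]),
    "ExitStmt": ("Statement", []),
}


def arity_ok(construct, counts):
    """counts[i] is the number of children supplied for component i."""
    _, components = PRODUCTIONS[construct]
    if len(counts) != len(components):
        return False
    for (_, multiplicity), n in zip(components, counts):
        if multiplicity == "1" and n != 1:
            return False
        if multiplicity == "+" and n < 1:
            return False
        if multiplicity == "*" and n < 0:
            return False
    return True
```

For instance, `arity_ok("ObrInt", [2, 0])` holds because two declarations and no statements satisfy "Declaration+ Statement*", whereas `arity_ok("ObrInt", [0, 3])` does not.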
AssignStmt
An AssignStmt construct represents an assignment statement. The
first component Expression is the target of the assignment. It
must be either an identifier expression (IdnExp) or an array index
expression (IndexExp). The second component Expression is the
source of the value to be assigned. The type of the second component
Expression must be the same as the type of the variable or array
element designated by the first.
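The two conditions on an AssignStmt can be sketched as a check that collects error messages. This is a Python illustration under assumptions: nodes are encoded as tuples whose first element is the construct name, types are plain strings, and `check_assign` is a hypothetical name.

```python
def check_assign(target, target_type, source_type):
    """Return the list of violations for one assignment (empty if none)."""
    errors = []
    # The target must be an identifier expression or an array index expression.
    if target[0] not in ("IdnExp", "IndexExp"):
        errors.append("target must be IdnExp or IndexExp")
    # The assigned value must have the type of the target object.
    if target_type != source_type:
        errors.append("type mismatch: %s := %s" % (target_type, source_type))
    return errors
```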
BoolExp
A Boolean may appear as a denotation in an expression. The type of
the value returned by the root Expression of the BoolExp construct
is Boolean.
BoolVar, IntVar, ArrayVar, IntParam
These constructs associate an identifier with a new variable of
the specified type (Boolean for BoolVar, integer for IntVar and
IntParam, and array for ArrayVar).
The effect of the Declaration is the establishment of a
relationship between the new variable and the component Identifier
of the construct. In the case of an array variable, the Integer
component gives the size of the array. The lowest index of an array
is always zero.
An IntParam construct is just like an IntVar except that the
initial value of the variable is read from the standard input
when the program begins execution.
Dyadic Expressions
A dyadic expression applies a function to two operand values,
obtaining a single result. The first component Expression of the
dyadic expression is the left operand, the second component
Expression is the right operand. The function to be applied is
determined from the types returned by the operand expressions. The
type returned by the root Expression is determined by the
function. Each component expression must return a value that is
compatible with the corresponding argument type of the function.
AndExp, EqualExp, GreaterExp, LessExp, MinusExp,
ModExp, NotEqualExp, OrExp, PlusExp, SlashExp and StarExp are the dyadic
node types.
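Selecting the function from the operand types can be sketched as a table lookup. This is a Python illustration only: the entries in `SIGNATURES` are assumed for the sake of the example and are not taken from the Obr definition, and `dyadic_type` is a hypothetical name.

```python
# (node type, left operand type, right operand type) -> result type.
# The entries below are illustrative assumptions, not the full Obr operator set.
SIGNATURES = {
    ("PlusExp", "integer", "integer"): "integer",
    ("LessExp", "integer", "integer"): "Boolean",
    ("AndExp", "Boolean", "Boolean"): "Boolean",
    ("EqualExp", "integer", "integer"): "Boolean",
    ("EqualExp", "Boolean", "Boolean"): "Boolean",
}


def dyadic_type(node, left_type, right_type):
    """Return the type of the root Expression, or None if no function applies."""
    return SIGNATURES.get((node, left_type, right_type))
```

A result of `None` corresponds to the case where an operand is not compatible with the argument type of any available function.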
EmptyStmt
An empty statement does nothing.
ExitStmt
An exit statement terminates execution of the smallest enclosing loop
statement.
ForStmt
A ForStmt represents an iteration whose body may be executed zero
or more times. The component Identifier specifies a variable
whose value counts in steps of one from the value of the first
Expression component up to and including the value of the second
Expression component. The types of the variable and the two
expressions must be integer. The component Statements are the
body of the loop. They are executed once for each value of the
variable.
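The counting behaviour can be sketched as follows. This is a Python illustration in which `run_for` is a hypothetical name and the two bounds are passed in as already evaluated values; whether Obr evaluates the bound expressions only once is an assumption of this sketch.

```python
def run_for(first, last, body):
    """Call body(i) for i = first, first + 1, ..., last (inclusive)."""
    i = first
    while i <= last:
        body(i)
        i += 1
```

If `first` is greater than `last`, the body is not executed at all, which is the "zero times" case.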
IdnExp
An identifier may appear as an operand in an expression. The type
of the value returned by the root Expression of the IdnExp
construct is the type of the object related to the identifier by
the visibility rules.
IdnLeaf
An identifier is a leaf of the source program tree. Its unique
encoding is established at the time the tree is built.
IfStmt
The IfStmt construct represents a conditional statement. The
component Expression of the construct is a tree that specifies the
condition, the first component Statements specify the statements
to be executed if the condition yields true, and the second
component Statements specify the statements to be executed if the
condition yields false. The type of value returned by the
component Expression must be Boolean. Conditionals without an else
part are represented by an IfStmt construct with an empty second
sequence of statements.
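The treatment of a missing else part can be sketched as a constructor that always produces both statement sequences. This is a Python illustration using a tuple encoding of nodes; the function `if_stmt` is a hypothetical name.

```python
def if_stmt(condition, then_stmts, else_stmts=None):
    """Build an IfStmt node; an absent else part becomes an empty sequence."""
    return ("IfStmt", condition, list(then_stmts), list(else_stmts or []))
```

With this shape, later phases never need a special case for conditionals without an else part.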
IntConst
An IntConst construct associates an identifier with a new constant
integer value. The component expression calculates the value to be
used, which must be constant. The Declaration establishes a
relationship between the component Identifier and the new constant,
whose value is that of the component expression.
IntLeaf
An integer is a leaf of the source program tree. Its value is
established at the time the tree is built.
IntExp
An integer may appear as a denotation in an expression. The type
of the value returned by the root Expression of the IntExp
construct is integer.
LoopStmt
A LoopStmt represents an iteration whose body executes until an
exit statement is executed within that body. The component
Statements are the loop body.
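The interaction between a LoopStmt and an exit statement can be sketched with an exception that is caught by the nearest enclosing loop. This is a Python illustration covering LoopStmt only; the names `ExitLoop` and `run_loop` are assumptions of the sketch.

```python
class ExitLoop(Exception):
    """Raised when an exit statement is executed."""


def run_loop(body):
    """Execute body() repeatedly until it raises ExitLoop."""
    while True:
        try:
            body()
        except ExitLoop:
            break  # only this, the smallest enclosing loop, terminates
```

Because each call of `run_loop` catches the exception itself, an exit inside a nested loop leaves the outer loop running.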
Monadic Expressions
A monadic expression applies a function to one operand value,
obtaining a result. The function to be applied is determined from
the type returned by the operand expression. The type returned by
the root Expression of the Monadic construct is determined by the
function. The component expression must return a value that is
compatible with the argument type of the function.
NegExp and NotExp are the monadic node types.
ObrInt
An ObrInt construct represents an Obr program that returns an
integer value. The component Declarations represent the
parameters to the program and the variables declared in the
program. The component Statements represent the statements to be
executed in the body of the program.
ReturnStmt
A return statement terminates execution of the program, delivering
the value returned by the component Expression. The value of the
component Expression must be of the type to be returned by the
program.
WhileStmt
A WhileStmt represents an iteration whose body may be executed
zero or more times. The component Expression of the construct is
a tree that specifies the iteration's condition, while the
component Statements are the loop body controlled by that
condition. The type of value returned by the component Expression
must be Boolean.
Example
As a complete example, the following source tree represents the
Obr GCD program given in the Obr Language Reference Manual. The
coordinate of each node is given with its construct. Leaves
also have their converted value. Coordinates in square brackets
represent nodes that are the first node in a sequence.
ObrInt (
  GCD,
  List (
    IntParam (x),
    IntParam (y)),
  List (
    WhileStmt (NotEqualExp (IdnExp (x), IdnExp (y)),
      List (
        IfStmt (GreaterExp (IdnExp (x), IdnExp (y)),
          List (AssignStmt (IdnExp (x),
                  MinusExp (IdnExp (x), IdnExp (y)))),
          List (AssignStmt (IdnExp (y),
                  MinusExp (IdnExp (y), IdnExp (x))))))),
    ReturnStmt (IdnExp (x))),
  GCD)
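To show that a tree of this shape carries enough information to run the program, the GCD tree can be transcribed into nested Python tuples and walked by a small interpreter. This is a sketch under assumptions: the tuple encoding, the names `evaluate`, `execute`, `run` and `Return`, and the passing of parameter values as a list rather than reading them from standard input are all introduced here, and only the constructs used by the GCD program are handled.

```python
class Return(Exception):
    """Carries the value delivered by a ReturnStmt."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def evaluate(expr, env):
    kind = expr[0]
    if kind == "IdnExp":
        return env[expr[1]]
    left, right = evaluate(expr[1], env), evaluate(expr[2], env)
    if kind == "NotEqualExp":
        return left != right
    if kind == "GreaterExp":
        return left > right
    if kind == "MinusExp":
        return left - right
    raise ValueError("unsupported expression: " + kind)


def execute(stmts, env):
    for stmt in stmts:
        kind = stmt[0]
        if kind == "AssignStmt":
            target, source = stmt[1], stmt[2]
            env[target[1]] = evaluate(source, env)  # IdnExp targets only
        elif kind == "IfStmt":
            _, cond, then_stmts, else_stmts = stmt
            execute(then_stmts if evaluate(cond, env) else else_stmts, env)
        elif kind == "WhileStmt":
            _, cond, body = stmt
            while evaluate(cond, env):
                execute(body, env)
        elif kind == "ReturnStmt":
            raise Return(evaluate(stmt[1], env))
        else:
            raise ValueError("unsupported statement: " + kind)


def run(program, args):
    """Bind the parameters to args (an assumption of this sketch) and execute."""
    _, _, decls, stmts, _ = program
    env = {decl[1]: arg for decl, arg in zip(decls, args)}
    try:
        execute(stmts, env)
    except Return as result:
        return result.value
    return None


x, y = ("IdnExp", "x"), ("IdnExp", "y")
GCD = ("ObrInt", "GCD",
       [("IntParam", "x"), ("IntParam", "y")],
       [("WhileStmt", ("NotEqualExp", x, y),
         [("IfStmt", ("GreaterExp", x, y),
           [("AssignStmt", x, ("MinusExp", x, y))],
           [("AssignStmt", y, ("MinusExp", y, x))])]),
        ("ReturnStmt", x)],
       "GCD")
```

Calling `run(GCD, [12, 18])` walks the tree and returns 6.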
Tony Sloane
Last modified: Mon Aug 18 9:24:56 EST 2003
Copyright (C) 2003-2011, by Macquarie University. All rights reserved.