CC Lecture 4
Compiler Construction
CS-4207
Lecture – 04
Disclaimer: The Contents of this reader are borrowed from the book(s) mentioned in
the reference section.
Introduction
The parser validates the input string against a context-free grammar (CFG) and then generates
output for the compiler's next phase. This output may be a parse tree or an abstract syntax
tree. Syntax-directed translation is used to interleave the syntax analysis and semantic
analysis phases of the compiler.
With both syntax-directed definitions and translation schemes, we conceptually parse the input
token stream, build the parse tree, and then traverse the tree as needed to evaluate the
semantic rules at the parse tree nodes. Evaluating the semantic rules may generate code, save
information in a symbol table, issue error messages, or perform any other action. The
translation of the token stream is the result obtained by evaluating the semantic rules.
What is Syntax Directed Translation?
Syntax-directed translation (SDT) augments the grammar with rules that facilitate semantic
analysis. SDT passes information bottom-up and/or top-down through the parse tree in the
form of attributes attached to the nodes. Syntax-directed translation rules use 1) lexical values
of nodes, 2) constants, and 3) attributes associated with the nonterminals in their definitions.
In syntax-directed translation, every nonterminal can have zero, one, or several attributes,
depending on the type of the attribute. The values of these attributes are computed by the
semantic rules associated with the production rule.
In the semantic rules, a typical attribute is val; an attribute may hold anything, such as a
string, a number, a memory location, or a complex record.
In syntax-directed translation, whenever a construct is encountered in the program, it is
translated according to the semantic rules defined for that particular programming language.
Atif Ishaq - Lecturer GC University, Lahore
Where
E.val is one of the attributes of E
num.lexval is the attribute returned by the lexical analyzer
Example
Syntax-directed translation is done by attaching rules or program fragments to productions
in a grammar. For example, consider an expression E generated by the production
E → E1 + T
E is the sum of the two subexpressions E1 and T. (The subscript in E1 is used only to
distinguish this instance of E in the production body from the E at the head.)
Pseudocode
translate E1;
translate T;
handle +;
The running example is the translation of an infix expression into a postfix expression.
Before exploring this example, let us understand two important related concepts.
Attributes: An attribute is any quantity associated with a programming construct (a symbol in
the grammar). Examples include the data types of expressions, the number of instructions in
the generated code, or the location of the first instruction for a construct, among many other
possibilities.
Postfix Notation
The Postfix notation for an expression E is inductively defined as follows
1. If E is a variable or constant then postfix notation of E is E itself.
2. If E is an expression of the form E1 op E2, where op is any binary operator, then the
postfix notation for E is E1' E2' op, where E1' and E2' are the postfix notations for E1 and
E2 respectively.
3. If E is a parenthesized expression of the form (E1), then the postfix notation of E is
the same as the postfix notation of E1.
Example of Postfix notation
(9 – 5) + 2 has the postfix form 95-2+
9, 5, and 2 are constants, so rule 1 applies to each of them
The translation of 9 – 5 to 95- is through rule 2
The translation of (9 – 5) is through rule 3
For the entire expression, (9 – 5) is treated as E1 and 2 as E2, and applying rule 2
gives 95-2+.
Apply the operator to its operands and replace the operator and its operands with the result
Then repeat this process: scan for the next operator and its corresponding operands.
Exercise
Consider the following postfix expression 952+-3*. Evaluate it following the above mentioned
evaluation steps.
952+-3* → 97-3* → 23* → 6
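The reduction above can be reproduced with a short sketch. It uses the usual stack formulation (push operands, and on an operator pop two operands and push the result), which is assumed here to be equivalent to the scan-and-replace steps. The function name eval_postfix and the restriction to single-digit operands and the operators +, -, * are assumptions made for illustration.

```python
def eval_postfix(expr: str) -> int:
    """Evaluate a postfix expression made of single-digit operands."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for ch in expr:
        if ch.isdigit():
            stack.append(int(ch))          # operand: push it
        elif ch in ops:
            right = stack.pop()            # the right operand is on top
            left = stack.pop()
            stack.append(ops[ch](left, right))
        else:
            raise ValueError(f"unexpected symbol {ch!r}")
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


print(eval_postfix("952+-3*"))  # 6
```

Each pop-pop-push corresponds to one arrow in the reduction: 5 2 + becomes 7, then 9 7 - becomes 2, then 2 3 * becomes 6.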
Types of Attributes
Attributes are of the following two types
1. Synthesized Attributes
A synthesized attribute is an attribute of the nonterminal on the left-hand side of a
production. Synthesized attributes represent information that is passed up the parse
tree. Such an attribute can take its value only from the node's children (the symbols on the
RHS of the production).
For example, let A → BC be a production of a grammar. If A’s attribute depends on B’s
attributes or C’s attributes, then it is a synthesized attribute.
E → E1 + T {E.val = E1.val + T.val}
In this case E.val derives its value from E1.val and T.val.
2. Inherited Attributes
An attribute of a nonterminal on the right-hand side of a production is called an inherited
attribute. Such an attribute can take its value either from its parent or from its siblings
(the symbols on the LHS or RHS of the production).
For example, let A → BC be a production of a grammar. If B’s attribute depends on A’s
attributes or C’s attributes, then it is an inherited attribute.
First we discuss synthesized attributes in detail. The idea of associating quantities with
programming constructs, for example values and types with expressions, can be expressed in
terms of grammars. We associate attributes with nonterminals and terminals. We then attach
rules to the productions of the grammar; these rules describe how the attributes are computed
at those nodes of the parse tree where the production in question relates a node to its children.
A Syntax-directed definition associates
1. With each grammar symbol, a set of attributes, and
2. With each production, a set of semantic rules for computing the values of the attributes
associated with the symbols appearing in the production.
Syntax Directed Translation Scheme
Suppose a node N in a parse tree is labeled by the grammar symbol X. We write X.a to denote
the value of attribute a of X at that node. A parse tree showing the attribute values at each
node is called an annotated parse tree. Below is an annotated parse tree for the expression
9 – 5 + 2 with an attribute a associated with the nonterminals E and T. The value 95-2+
of the attribute at the root is the postfix notation for the given expression.
We have already discussed that an attribute is said to be synthesized if its value at a parse-
tree node N is determined from attribute values at the children of N and at N itself.
Synthesized attributes have the desirable property that they can be evaluated during a single
bottom-up traversal of a parse tree. The table below gives the syntax-directed definition for
the infix-to-postfix translation used in the annotated parse tree above. Each nonterminal has
a string-valued attribute a that represents the postfix notation for the expression generated
by that nonterminal in a parse tree. The symbol || in the semantic rules is the operator for
string concatenation.
Productions        Semantic Rules
E → E1 + T         E.a = E1.a || T.a || ‘+’
E → E1 - T         E.a = E1.a || T.a || ‘-’
E → T              E.a = T.a
T → 0              T.a = ‘0’
T → 1              T.a = ‘1’
…..                …..
T → 9              T.a = ‘9’
As already discussed, the postfix form of a digit is the digit itself; e.g., the
semantic rule associated with the production T → 9 defines T.a to be 9 itself whenever this
production is used at a node in a parse tree. The other digits are translated similarly. As
another example, when the production E → T is applied, the value of T.a becomes the value
of E.a.
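A minimal sketch of how these semantic rules could be evaluated on a parse tree is given here. The Node class, the evaluate function, and the helper digit are hypothetical names introduced only for illustration, and the tree for 9 - 5 + 2 is built by hand rather than by a parser.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str                       # "E", "T", or a terminal such as "9" or "+"
    children: list = field(default_factory=list)
    a: str = ""                       # the synthesized attribute a from the table


def evaluate(node: Node) -> str:
    """Compute attribute a bottom-up: children first, then the node itself."""
    for child in node.children:
        evaluate(child)
    kids = node.children
    if node.symbol == "E" and len(kids) == 3:
        # E -> E1 + T  or  E -> E1 - T :  E.a = E1.a || T.a || operator
        node.a = kids[0].a + kids[2].a + kids[1].symbol
    elif node.symbol == "E":
        node.a = kids[0].a            # E -> T :  E.a = T.a
    elif node.symbol == "T":
        node.a = kids[0].symbol       # T -> digit :  T.a = the digit itself
    return node.a


def digit(d: str) -> Node:
    """Hypothetical helper: the subtree for T -> d."""
    return Node("T", [Node(d)])


# Hand-built parse tree for 9 - 5 + 2
nine = Node("E", [digit("9")])
diff = Node("E", [nine, Node("-"), digit("5")])
root = Node("E", [diff, Node("+"), digit("2")])

print(evaluate(root))  # 95-2+
```

After the call, every E and T node carries its own value of a, so the tree is effectively an annotated parse tree: diff.a is 95- and root.a is 95-2+.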
Important Consideration
The string representing the translation of the nonterminal at the head of each production is
the concatenation of the translations of the nonterminals in the production body, in the same
order as in the production, with some optional additional strings interleaved. A syntax-
directed definition with this property is termed simple. We can see in the above table that
while defining the semantic rules the operands appear in the same order as in the production
body.
Tree Traversal
Tree traversals will be used to describe attribute evaluation and to specify how code pieces
in a translation scheme should be executed. A traversal of a tree begins at the root and travels
through each node in some sequence.
Depth-First Search
Depth-first search (DFS) is a technique for traversing a tree or graph that relies on
backtracking. The traversal first descends as deep as possible and backtracks to the
parent node once a node has no unvisited children left. The procedure visit(N) is a
depth-first traversal that visits the children of a node in left-to-right order.
A syntax-directed definition does not impose any specific order for the evaluation of
attributes on a parse tree; any evaluation order that computes an attribute a after all the other
attributes that a depends on is acceptable. Synthesized attributes can be evaluated during
any bottom-up traversal, that is, a traversal that evaluates attributes at a node after having
evaluated attributes at its children.
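A depth-first, left-to-right traversal that handles a node only after all of its children can be sketched as follows. The tuple representation (label, children) and the callback parameter evaluate are assumptions for illustration; in an attribute evaluator the callback would apply the semantic rules at that node.

```python
def visit(node, evaluate):
    """Visit the children left to right, then handle the node itself."""
    label, children = node
    for child in children:
        visit(child, evaluate)
    evaluate(label)


# A small tree: A has children B and C; B has children D and E.
tree = ("A", [("B", [("D", []), ("E", [])]), ("C", [])])

order = []
visit(tree, order.append)
print(order)  # ['D', 'E', 'B', 'C', 'A']
```

Every node appears after all of its descendants, which is exactly the order in which synthesized attributes can be computed.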
Translation Scheme
We now look at a different approach that does not need to manipulate strings; it produces
the same translation incrementally by executing program fragments. Semantic actions are
program fragments embedded within production bodies. The position at which an action is
to be executed is shown by enclosing it in curly braces and writing it within the
production body. When drawing a parse tree for a translation scheme, we indicate an action
by constructing an extra child for it, connected by a dashed line to the node that corresponds
to the head of the production. The parse tree below shows the actions that translate the
expression 9 – 5 + 2 into the postfix notation 95-2+.
The root of the parse tree represents the first production of the translation scheme. In a
postorder traversal, we first perform all the actions in the leftmost subtree of the root, for
the left operand, also labeled expr like the root. We then visit the leaf + at which there is
no action. We next perform the actions in the subtree for the right operand term and,
finally, the semantic action {print('+')} at the extra node.
Since the productions for term have only a digit on the right side, that digit is printed by the
actions for the productions. No output is necessary for the production expr → term, and only
the operator needs to be printed in the action for each of the first two productions.
The semantic actions in the parse tree translate the infix expression 9 − 5 + 2 into 95-2+
by printing each character in 9 − 5 + 2 exactly once, without using any storage for the
translation of subexpressions. Note that the implementation of a translation scheme must
ensure that semantic actions are performed in the order they would appear during a postorder
traversal of a parse tree.
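The execution order of the actions can be sketched by building the parse tree with its extra action children and traversing it depth-first. The Action class, the tuple encoding of nodes, and the functions run_actions and term are hypothetical; the print actions are modelled by appending to a list so that the output can be inspected.

```python
from dataclasses import dataclass


@dataclass
class Action:
    text: str  # what the print action at this extra child would emit


def run_actions(node, out):
    """Depth-first, left-to-right traversal; run each action when it is reached."""
    if isinstance(node, Action):
        out.append(node.text)
    elif isinstance(node, tuple):
        _label, children = node
        for child in children:
            run_actions(child, out)
    # Plain strings are terminal leaves and carry no action.


def term(d):
    """Subtree for term -> d {print(d)}."""
    return ("term", [d, Action(d)])


# Parse tree for 9 - 5 + 2 with the action as the last child of each production.
tree = ("expr", [
    ("expr", [
        ("expr", [term("9")]),
        "-",
        term("5"),
        Action("-"),
    ]),
    "+",
    term("2"),
    Action("+"),
])

out = []
run_actions(tree, out)
print("".join(out))  # 95-2+
```

No string is ever built for a subexpression; each character is emitted once, at the moment its action node is visited.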
Exercise-1
Construct a syntax-directed translation scheme that translates arithmetic expressions from infix
notation into prefix notation in which an operator appears before its operands; e.g., −xy is the
prefix notation for x − y. Give annotated parse trees for the inputs 9 − 5 + 2 and 9 − 5 ∗ 2.
Solution
Productions                      Translation Scheme
expr -> expr + term              expr -> {print("+")} expr + term
      | expr - term                    | {print("-")} expr - term
      | term                           | term
term -> term * factor            term -> {print("*")} term * factor
      | term / factor                  | {print("/")} term / factor
      | factor                         | factor
factor -> digit                  factor -> digit {print(digit)}
        | (expr)                         | (expr)
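To check the expected output of this scheme on the two inputs, the following sketch may help. It is not a direct implementation of the scheme: it assumes single-digit operands, replaces the left-recursive productions with loops while parsing, builds a small tree, and then emits the operator before its operands, which is the order in which the print actions above would run in a depth-first traversal. The name infix_to_prefix and all helper names are hypothetical.

```python
def infix_to_prefix(text: str) -> str:
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def expr():                       # expr -> expr + term | expr - term | term
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())   # left-associative
        return node

    def term():                       # term -> term * factor | term / factor | factor
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():                     # factor -> digit | (expr)
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def emit(node):
        if isinstance(node, str):
            return node               # factor -> digit {print(digit)}
        op, left, right = node
        return op + emit(left) + emit(right)   # operator first, then operands

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return emit(tree)


print(infix_to_prefix("9-5+2"))  # +-952
print(infix_to_prefix("9-5*2"))  # -9*52
```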
Exercise-2
Construct a syntax-directed translation scheme that translates arithmetic expressions from
postfix notation into infix notation. Give annotated parse trees for the inputs
95 − 2 ∗ and 952 ∗ −.
Solution
Productions                Translation Scheme
expr -> expr expr +        expr -> expr {print("+")} expr +
      | expr expr -              | expr {print("-")} expr -
      | expr expr *              | {print("(")} expr {print(")*(")} expr {print(")")} *
      | expr expr /              | {print("(")} expr {print(")/(")} expr {print(")")} /
      | digit                    | digit {print(digit)}
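The output this scheme produces can be previewed with a stack-based sketch. This is an assumption-laden shortcut rather than an implementation of the scheme: instead of printing during a parse, it keeps the infix text of each subexpression on a stack and combines the texts the same way the print actions would. The name postfix_to_infix and the single-digit restriction are hypothetical.

```python
def postfix_to_infix(postfix: str) -> str:
    stack = []
    for ch in postfix:
        if ch.isspace():
            continue
        if ch.isdigit():
            stack.append(ch)                          # digit {print(digit)}
        elif ch in ("+", "-", "*", "/"):
            if len(stack) < 2:
                raise ValueError("malformed postfix expression")
            right = stack.pop()
            left = stack.pop()
            if ch in ("+", "-"):
                stack.append(f"{left}{ch}{right}")      # no parentheses printed
            else:
                stack.append(f"({left}){ch}({right})")  # operands parenthesized
        else:
            raise ValueError(f"unexpected symbol {ch!r}")
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


print(postfix_to_infix("95-2*"))  # (9-5)*(2)
print(postfix_to_infix("952*-"))  # 9-(5)*(2)
```

One caveat worth checking by hand: because the scheme prints no parentheses around the operands of + and -, an input such as 952+- comes out as 9-5+2, which no longer means 9 - (5 + 2). Parenthesizing the right operand of - would avoid this.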
Parser
Parsing, for instance by recursive descent, can be used both to parse the input and to
implement syntax-directed translation.
Parsing is the process of determining whether a string of tokens can be generated by a grammar.
For any context-free grammar there is a parser that takes at most O(n^3) time to parse a
string of n tokens.
Linear-time algorithms suffice for parsing programming language source code.
Top-down parsing “constructs” a parse tree from root to leaves
Bottom-up parsing “constructs” a parse tree from leaves to root
Top down Parser
The top-down construction of a parse tree is done by starting with the root, labeled with the
starting nonterminal, and repeatedly performing the following two steps.
1. At node N, labeled with nonterminal A, select one of the productions for A and
construct children at N for the symbols in the production body.
2. Find the next node at which a subtree is to be constructed, typically the leftmost
unexpanded nonterminal of the tree.
Predictive Parsing
Recursive descent parsing is a top-down method of syntax analysis in which a set of recursive
procedures is used to process the input.
Each nonterminal has one (recursive) procedure that is responsible for parsing the
nonterminal’s syntactic category of input tokens
When a nonterminal has multiple productions, each production is implemented in a
branch of a selection statement based on input look-ahead information
Predictive parsing is a special form of recursive descent parsing where we use one lookahead
token to unambiguously determine the parse operations
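A minimal sketch of a predictive parser that also performs the infix-to-postfix translation follows. Recursive descent cannot handle left-recursive productions directly, so the sketch assumes the expression grammar has been rewritten as expr → term rest, rest → + term rest | - term rest | ε, term → digit; this rewritten grammar, the class name PredictiveParser, and the function translate are assumptions for illustration.

```python
class PredictiveParser:
    def __init__(self, text: str):
        self.tokens = [c for c in text if not c.isspace()]
        self.pos = 0
        self.out = []                 # collects what the print actions emit

    @property
    def lookahead(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        if self.lookahead != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):                   # expr -> term rest
        self.term()
        self.rest()

    def rest(self):                   # rest -> + term {print('+')} rest
        if self.lookahead in ("+", "-"):      #  | - term {print('-')} rest
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)
            self.rest()
        # Any other lookahead selects the empty production.

    def term(self):                   # term -> digit {print(digit)}
        tok = self.lookahead
        if tok is not None and tok.isdigit():
            self.match(tok)
            self.out.append(tok)
        else:
            raise SyntaxError(f"expected a digit, found {tok!r}")


def translate(text: str) -> str:
    parser = PredictiveParser(text)
    parser.expr()
    if parser.lookahead is not None:
        raise SyntaxError(f"unexpected token {parser.lookahead!r}")
    return "".join(parser.out)


print(translate("9-5+2"))  # 95-2+
```

Each nonterminal has exactly one procedure, and every choice between productions is made by inspecting the single lookahead token, so the parser never backtracks.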
For the following statement
References
1. Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, “Compilers: Principles,
Techniques, and Tools”, Chapter 2, Second Edition, Pearson.
2. https://www.geeksforgeeks.org/syntax-directed-translation-in-compiler-design/
3. https://www.javatpoint.com/syntax-directed-translation
4. https://www.cs.csustan.edu/~xliang/Courses/CS4300-19F/Notes/Ch2.pdf