CD Unit-4 &5
Intermediate code generation – Variants of syntax trees, Three address code, Types and
declarations, Translation of expressions, Type checking, Control flow, Back patching.
Step 2
Construct DAG for e * ( b – c )
Step 3
The node for ( b – c ) already exists, so it is reused rather than constructed again. Hence construct the DAG for ( b – c ) * a
Step 4
Finally draw DAG for the expression e * ( b – c ) + ( b – c ) * a
5. if x relop y goto L Conditional jump ( relop is <>, <, >, <=, >=, = )
6. Param x Procedure call
7. call p, n Procedure call
8. return z Procedure call
9. a=b[i] Index assignment
10. x[i]=y Index assignment
11. x = &y Pointer assignment
x = *y
12. *x = y Pointer assignment
Triple representation
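In this representation, temporary variables are avoided: the result of an operation is referred to by the position of the triple that computes it, and operands that are names are pointers into the symbol table.
A triple is a structure with at most three fields: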
op arg1 arg2
For example, consider the input string x = - a * b + - a * b
The following is the three address code.
t1 = uminus a
t2 = t1 * b
t3 = uminus a
t4 = t3 * b
t5 = t2 + t4
x = t5
The following is the triple representation.
       op       arg1    arg2
(0)    uminus   a
(1)    *        (0)     b
(2)    uminus   a
(3)    *        (2)     b
(4)    +        (1)     (3)
(5)    =        x       (4)
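The triple table above can be reproduced with a short program. This is a minimal sketch, assuming each three-address instruction is given as a tuple (result, op, arg1, arg2); the function name to_triples and this tuple layout are hypothetical and chosen only for illustration.

```python
def to_triples(tac):
    """Convert three-address code into triples.

    tac: list of (result, op, arg1, arg2). arg2 is None for unary operators,
    and op is "=" for a plain copy (result = arg1).
    """
    position = {}   # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # A temporary is replaced by a reference to the triple that computes it.
        return f"({position[arg]})" if arg in position else arg

    for result, op, arg1, arg2 in tac:
        if op == "=":
            triples.append(("=", result, ref(arg1)))
        else:
            second = ref(arg2) if arg2 is not None else ""
            triples.append((op, ref(arg1), second))
            position[result] = len(triples) - 1
    return triples


tac = [
    ("t1", "uminus", "a", None),
    ("t2", "*", "t1", "b"),
    ("t3", "uminus", "a", None),
    ("t4", "*", "t3", "b"),
    ("t5", "+", "t2", "t4"),
    ("x", "=", "t5", None),
]
triples = to_triples(tac)
for index, triple in enumerate(triples):
    print(f"({index})", *triple)
```

Note that the temporaries t1 to t5 never appear in the result; each one is replaced by the number of the triple that computes it.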
Type expressions
A type expression denotes the type of a language construct. It can be a basic type, e.g. a
primitive data type such as int, real, char or boolean, or a type error, or void (no
type).
Type equivalence
Type equivalence involves two important concepts.
Name equivalence: two types are equivalent if they have the same name. For example, a and b have the same type and c has a
different type.
Structural equivalence: two types are equivalent if they have the same structure. For example, a, b and c have the
same type.
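The difference between the two notions can be sketched in code. The declarations below (Point, Pair, and the variables a, b, c) are hypothetical, since the declarations behind the example are not shown here; types are modelled as nested tuples, and recursive types are not handled.

```python
# Hypothetical declarations, in a C-like notation:
#   typedef struct { int x; int y; } Point;
#   typedef struct { int x; int y; } Pair;
#   Point a, b;   Pair c;
type_defs = {
    "Point": ("record", (("x", "int"), ("y", "int"))),
    "Pair": ("record", (("x", "int"), ("y", "int"))),
}
var_types = {"a": "Point", "b": "Point", "c": "Pair"}


def name_equivalent(t1, t2):
    # Equivalent only if the type names are identical.
    return t1 == t2


def expand(t):
    """Replace type names by their definitions, recursively."""
    if isinstance(t, str):
        return expand(type_defs[t]) if t in type_defs else t
    return tuple(expand(part) for part in t)


def structurally_equivalent(t1, t2):
    # Equivalent if the fully expanded structures are identical.
    return expand(t1) == expand(t2)


print(name_equivalent(var_types["a"], var_types["b"]))          # True
print(name_equivalent(var_types["a"], var_types["c"]))          # False
print(structurally_equivalent(var_types["a"], var_types["c"]))  # True
```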
Declaration
Whenever a declaration is encountered, a symbol table entry is created for every local
name in the procedure.
The symbol table entry should contain the type of the name, how much storage the name
requires, and its relative offset.
Storage layout for local names
If we know the type of a name, we can determine the amount of storage required for
the name at run time.
Type and relative address are saved in the symbol table entry for the name.
The ‘width’ of a type is defined as the number of storage units required for objects of
that type.
A syntax-directed translation (SDT) scheme is used to compute types and their widths for basic and array types.
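The width and offset computation can be sketched as follows. The widths in BASIC_WIDTH are assumed values (they depend on the target machine), the names width and lay_out are hypothetical, and alignment is ignored to keep the sketch short.

```python
# Assumed widths in storage units (bytes); real values are machine dependent.
BASIC_WIDTH = {"char": 1, "int": 4, "float": 8}


def width(t):
    """Width of a basic type name or of an array type ("array", n, element)."""
    if isinstance(t, tuple) and t[0] == "array":
        _, count, element = t
        return count * width(element)
    return BASIC_WIDTH[t]


def lay_out(declarations):
    """Assign relative offsets to the declared names, in declaration order."""
    table, offset = {}, 0
    for name, t in declarations:
        table[name] = {"type": t, "width": width(t), "offset": offset}
        offset += width(t)
    return table, offset


declarations = [
    ("i", "int"),
    ("c", "char"),
    ("a", ("array", 10, "int")),
    ("x", "float"),
]
table, total = lay_out(declarations)
for name, entry in table.items():
    print(name, entry["width"], entry["offset"])
print("total storage:", total)
```

Each symbol table entry ends up with exactly the three items listed above: the type, the width and the relative offset.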
5 Q: Explain about Type checking.
Rules for Type Checking (TC)
Type checking can take two forms: Type Synthesis (TS) and Type Inference (TI).
Type Synthesis (TS) builds the type of an expression from the types of its
subexpressions. TS requires names to be declared before they are used.
Type Inference (TI) determines the type of a language construct from the
way it is used.
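Type synthesis can be sketched as a bottom-up walk over an expression. This is an illustration only: the expression encoding (nested tuples), the function name synthesize and the small set of rules are assumptions, not a complete type system.

```python
ARITHMETIC = {"+", "-", "*", "/"}
RELATIONAL = {"<", ">", "<=", ">=", "==", "!="}
NUMERIC = {"int", "real"}


def synthesize(expr, env):
    """Return the type of expr, built from the types of its subexpressions.

    expr: a declared name (str), an int or float literal, or (op, left, right).
    env:  declared names mapped to their types.
    """
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, float):
        return "real"
    if isinstance(expr, str):
        # Names must be declared before they are used.
        return env.get(expr, "type_error")
    op, left, right = expr
    lt, rt = synthesize(left, env), synthesize(right, env)
    if lt not in NUMERIC or rt not in NUMERIC:
        return "type_error"
    if op in ARITHMETIC:
        return "real" if "real" in (lt, rt) else "int"
    if op in RELATIONAL:
        return "boolean"
    return "type_error"


env = {"i": "int", "x": "real"}
print(synthesize(("+", "i", ("*", "x", 2)), env))   # real
print(synthesize(("<", "i", 3), env))               # boolean
print(synthesize(("+", "i", "undeclared"), env))    # type_error
```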
Type conversions
Type conversion rules differ from one language to another. Java uses widening and
narrowing conversions between primitive types.
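The widening rule can be sketched as a check on an ordered list of types. The sketch assumes the Java primitive widening order byte, short, int, long, float, double, with char widening to int and beyond; boolean takes no part in these conversions and is left out. The function names are hypothetical.

```python
# Java numeric primitive types, from narrowest to widest.
WIDENING_ORDER = ["byte", "short", "int", "long", "float", "double"]


def can_widen(source, target):
    """True if source equals target or converts to it by widening."""
    if source == target:
        return True
    if target == "char":
        return False          # no other type widens to char
    if source == "char":
        return target in WIDENING_ORDER[2:]   # int, long, float, double
    return WIDENING_ORDER.index(source) < WIDENING_ORDER.index(target)


def conversion_kind(source, target):
    # Widening can be applied implicitly; narrowing needs an explicit cast.
    return "widening" if can_widen(source, target) else "narrowing"


print(conversion_kind("int", "double"))   # widening
print(conversion_kind("double", "int"))   # narrowing
print(conversion_kind("char", "int"))     # widening
```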
Unification
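Unification is the problem of testing whether two expressions can be made equal, that is, whether they become identical when their variables are replaced by suitable expressions.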
6 Q: Explain about Control Flow in Intermediate code generation.
The main idea of converting any flow of control statement to 3AC is to simulate
the “branching” of the control flow.
Boolean expressions in programming languages are often used to:
1. Alter the flow of control.
2. Compute logical value.
Flow of control statements may be converted to 3AC by using following functions:
1. new label – returns a new symbolic label each time it is called.
2. gen ( ) – generates the code (string) passed as parameter to it.
The following attributes are associated with non-terminals for the code generation:
1. code– contains the generated 3AC.
2. true – contains the label to which a jump takes place if the Boolean expression
associated evaluates to true.
3. false – contains the label to which a jump takes place if the Boolean expression
associated evaluates to false.
4. begin – contains the label / address pointing to the beginning of the code block
for the statement ‘generated’ by the non-terminal.
Boolean expressions
Boolean expressions are built by applying Boolean operators such as AND ( && ), OR ( || ) and
NOT ( ! ), as in C, to elements that are Boolean variables or relational
expressions of the form E1 rel E2
E1 rel E2
if – then else
while – do
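The use of the new label and gen functions and of the begin, true and false labels can be sketched for the three constructs named above. This is an illustrative sketch, not a full translation scheme: the helper names gen_rel, gen_if_else and gen_while are hypothetical, and 3AC is kept as a list of strings.

```python
import itertools

_counter = itertools.count(1)


def newlabel():
    """Return a new symbolic label each time it is called."""
    return f"L{next(_counter)}"


def gen(*parts):
    """Build one line of three-address code from its parts."""
    return " ".join(str(p) for p in parts)


def gen_rel(left, rel, right, true, false):
    """Jumping code for the Boolean expression `left rel right`."""
    return [gen("if", left, rel, right, "goto", true), gen("goto", false)]


def gen_if_else(cond, then_code, else_code):
    """if ( cond ) then_code else else_code; cond is (left, rel, right)."""
    true, false, after = newlabel(), newlabel(), newlabel()
    return (gen_rel(*cond, true, false)
            + [true + ":"] + then_code + [gen("goto", after)]
            + [false + ":"] + else_code
            + [after + ":"])


def gen_while(cond, body):
    """while ( cond ) do body; cond is (left, rel, right)."""
    begin, true, false = newlabel(), newlabel(), newlabel()
    return ([begin + ":"]
            + gen_rel(*cond, true, false)
            + [true + ":"] + body + [gen("goto", begin)]
            + [false + ":"])


for line in gen_while(("i", "<", "10"), ["i = i + 1"]):
    print(line)
```

In the printed code, the first label marks the beginning of the loop, the second is the target when the condition is true, and the third is the exit taken when it is false.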
Runtime environments – Stack allocation of space, Access to non-local data on the stack,
Heap management.
Code generation – Issues in the design of code generation, The target language, Addresses
in the target code, Basic blocks and flow graphs, A simple code generator.
21. begin
22. a [ 0 ] := -9999, a [ 10 ] := 9999
23. readarray
24. quicksort(1, 9)
25. end
The following is the activation tree corresponding to the output of quicksort.
Activation Record ( AR )
An activation record is a block of memory that holds the information needed for a single
execution of a procedure.
The following is the activation record ( Read from bottom to top )
Actual parameters
Returned values
Control link
Access link
Saved machine status
Local data
Temporaries
Register allocation
Instructions involving register operands are usually shorter and faster than those
involving operands in memory. Therefore, efficient utilization of registers is
particularly important in generating good code.
The most frequently used variables should be kept in processor registers for faster access.
The use of registers is often subdivided into two subproblems:
During “register allocation”, we select the set of variables that will reside in
registers at a point in the program.
During “register assignment”, we pick the specific register that a variable will
reside in.
For example, consider
t1 = a * b
t2 = t1 + c
t3 = t2 / d
An optimal machine code sequence, keeping the running value in a single register, is,
L R1, a
M R1, b
A R1, c
D R1, d
ST R1, t3
3. Computation operation
The general form is OP dst, src1, src2 where OP is an operator like ADD or SUB.
For example, ADD R1, R2, R3 adds the values in R2 and R3 and stores the result in R1.
4. Unconditional jump ( BR )
The general form is BR L where BR is branch. This causes control to branch to the
machine instruction with label L.
5. Conditional jump
The general form is Bcond R, L where R is a register, L is label and cond stands for
any of the common tests on values in register R.
For example, BGTZ R2, L This instruction causes a jump to label L if the value in
register R2 is greater than zero and allows control to pass to the next machine
instruction if not.
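The behaviour of these instruction forms can be sketched with a very small interpreter. It is an illustration only: instructions are encoded as tuples, labels are written as ("LABEL", name) entries, and only ADD, SUB, BR and BGTZ are handled; all of these encoding choices are assumptions.

```python
def run(program, registers):
    """Execute a list of instruction tuples and return the register file."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "LABEL"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op in ("ADD", "SUB"):
            # OP dst, src1, src2
            _, dst, src1, src2 = ins
            a, b = registers[src1], registers[src2]
            registers[dst] = a + b if op == "ADD" else a - b
        elif op == "BR":
            # BR L: unconditional jump to label L
            pc = labels[ins[1]]
            continue
        elif op == "BGTZ":
            # BGTZ R, L: jump to L if R > 0, otherwise fall through
            if registers[ins[1]] > 0:
                pc = labels[ins[2]]
                continue
        elif op != "LABEL":
            raise ValueError(f"unknown instruction {op}")
        pc += 1
    return registers


# Adds R2 to R1 and decrements R2 by R3 until R2 is no longer positive.
program = [
    ("LABEL", "LOOP"),
    ("ADD", "R1", "R1", "R2"),
    ("SUB", "R2", "R2", "R3"),
    ("BGTZ", "R2", "LOOP"),
]
result = run(program, {"R1": 0, "R2": 3, "R3": 1})
print(result["R1"])   # 3 + 2 + 1 = 6
```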
6 Q: Addresses in the target code
This topic covers two storage allocation strategies: static allocation and stack
allocation.
Static allocation strategy
In static allocation, names are bound to storage as the program is compiled. Since the
bindings do not change at run time, a name is bound to the same storage every time
the procedure is activated.
The compiler must decide where the activation record is to go, relative to the target
code. Once this decision is made, the position of each activation record and the storage
for each name in the record is fixed.
As a result, the compiler can fill in, at compile time, the addresses that the target code operates on.
Limitations:
The size of data objects must be known at compile time.
Recursive procedures are not allowed because all the activations of a procedure use
the same bindings for local names.
Data structures cannot be created dynamically since there is no mechanism for
storage allocation at runtime.
Stack allocation strategy
In stack allocation, storage is organized as a stack, and activation records are pushed
and popped as activations begin and end respectively.
Storage for the locals in each call of a procedure is contained in the activation record for that call.
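The push and pop behaviour can be sketched by managing activation records explicitly. This is a simulation only: the record layout (a dictionary with actuals and locals) and the helper call are hypothetical and much simpler than a real activation record.

```python
stack = []                  # run-time stack of activation records
stats = {"deepest": 0}      # largest number of records alive at once


def call(procedure, *actuals):
    """Push an activation record, run the procedure, then pop the record."""
    record = {"procedure": procedure.__name__, "actuals": actuals, "locals": {}}
    stack.append(record)                      # activation begins: push
    stats["deepest"] = max(stats["deepest"], len(stack))
    try:
        return procedure(record, *actuals)
    finally:
        stack.pop()                           # activation ends: pop


def fact(record, n):
    # Each call keeps its own local 'result' in its own activation record.
    record["locals"]["result"] = 1 if n <= 1 else n * call(fact, n - 1)
    return record["locals"]["result"]


print(call(fact, 4), stats["deepest"], len(stack))   # 24 4 0
```

Because every activation has its own record, recursion works here, which is exactly what static allocation cannot support.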
Let us apply step1 and step2 of the algorithm to identify basic blocks.
Step1
1. i = 0
2. if ( i > 10 ) goto 6
3. a[ i ] = 0
4. i = i + 1
5. goto 2
6. end
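Based on rule 1 of step1, statement 1 is a leader.
Based on rule 2 of step1, statements 2 and 6 are leaders, since they are the targets of the jumps in statements 5 and 2.
Based on rule 3 of step1, statement 3 is a leader, since it immediately follows the jump in statement 2.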
Step2
For each leader, construct the basic blocks. ( L marks a leader. )
L   1. i = 0                     B1
L   2. if ( i > 10 ) goto 6      B2
L   3. a[ i ] = 0
    4. i = i + 1
    5. goto 2                    B3
L   6. end                       B4
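The two steps can be sketched in code. The sketch assumes the usual three leader rules (the first statement; any target of a jump; any statement immediately following a jump) and a program given as a mapping from statement numbers to text; the function names are hypothetical.

```python
import re


def find_leaders(code):
    """Return the sorted statement numbers that are leaders."""
    numbers = sorted(code)
    leaders = {numbers[0]}                        # rule 1: first statement
    for pos, n in enumerate(numbers):
        match = re.search(r"goto\s+(\d+)", code[n], re.IGNORECASE)
        if match:
            leaders.add(int(match.group(1)))      # rule 2: target of a jump
            if pos + 1 < len(numbers):
                leaders.add(numbers[pos + 1])     # rule 3: follows a jump
    return sorted(leaders)


def basic_blocks(code):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(code)
    blocks = []
    for i, start in enumerate(leaders):
        end = leaders[i + 1] if i + 1 < len(leaders) else max(code) + 1
        blocks.append([n for n in sorted(code) if start <= n < end])
    return blocks


code = {
    1: "i = 0",
    2: "if ( i > 10 ) goto 6",
    3: "a[ i ] = 0",
    4: "i = i + 1",
    5: "goto 2",
    6: "end",
}
print(find_leaders(code))   # [1, 2, 3, 6]
print(basic_blocks(code))   # [[1], [2], [3, 4, 5], [6]]
```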
Flow graph
Flow graphs are used to represent the basic blocks and their relationship by a directed
graph.
There exists an edge from block1 to block2 iff it is possible for the first instruction in
block2 to immediately follow the last instruction in block1.
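For example, let us write the flow graph for the following basic blocks.
B1   1. i = 0
B2   2. if ( i > 10 ) goto 6
B3   3. a[ i ] = 0
     4. i = i + 1
     5. goto 2
B4   6. end
In the flow graph, the jump targets are written as block names.
B1   1. i = 0
B2   2. if ( i > 10 ) goto B4
B3   3. a[ i ] = 0
     4. i = i + 1
     5. goto B2
B4   6. end
The edges are B1 → B2, B2 → B3, B2 → B4 and B3 → B2.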
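The edges of the flow graph can be derived from the last statement of each block. This is a minimal sketch, assuming blocks are given in program order as (name, statements) pairs with consecutively numbered statements; the function name flow_edges is hypothetical.

```python
import re


def flow_edges(blocks):
    """blocks: list of (name, [(number, text), ...]) in program order."""
    # Map the number of each block's first statement (its leader) to the block.
    owner = {stmts[0][0]: name for name, stmts in blocks}
    edges = []
    for i, (name, stmts) in enumerate(blocks):
        last = stmts[-1][1]
        match = re.search(r"goto\s+(\d+)", last, re.IGNORECASE)
        if match:
            edges.append((name, owner[int(match.group(1))]))   # jump edge
        unconditional = bool(match) and not last.lower().startswith("if")
        if not unconditional and i + 1 < len(blocks):
            edges.append((name, blocks[i + 1][0]))             # fall through
    return edges


blocks = [
    ("B1", [(1, "i = 0")]),
    ("B2", [(2, "if ( i > 10 ) goto 6")]),
    ("B3", [(3, "a[ i ] = 0"), (4, "i = i + 1"), (5, "goto 2")]),
    ("B4", [(6, "end")]),
]
for source, target in flow_edges(blocks):
    print(source, "->", target)
```

A block ending in an unconditional goto gets only the jump edge, while a block ending in a conditional jump gets both the jump edge and the fall-through edge.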
t1 = a + b
t2 = a + b
t3 = t1 * t2
t4 = t3 + c
t5 = t4 + d
x = t5
The local common subexpressions in the above basic block are
t1 = a + b
t2 = a + b
Since a + b is computed twice, the second computation can be eliminated and t1 used in its place. The block obtained
after eliminating the local common subexpression is shown below.
t1 = a + b
t2 = t1 * t1
t3 = t2 + c
t4 = t3 + d
x = t4
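The elimination can be sketched as a single pass over the block. The sketch assumes every name is assigned at most once inside the block, so an available expression never has to be invalidated; the instruction encoding and the function name are hypothetical. Unlike the listing above, it does not renumber the remaining temporaries.

```python
def eliminate_local_cse(block):
    """block: list of (target, left, op, right), or (target, source) for a copy."""
    available = {}   # (left, op, right) -> name that already holds the value
    alias = {}       # eliminated temporary -> name that replaces it
    result = []
    for instr in block:
        if len(instr) == 2:
            target, source = instr
            result.append((target, alias.get(source, source)))
            continue
        target, left, op, right = instr
        left, right = alias.get(left, left), alias.get(right, right)
        key = (left, op, right)
        if key in available:
            alias[target] = available[key]    # reuse the earlier result
        else:
            available[key] = target
            result.append((target, left, op, right))
    return result


block = [
    ("t1", "a", "+", "b"),
    ("t2", "a", "+", "b"),
    ("t3", "t1", "*", "t2"),
    ("t4", "t3", "+", "c"),
    ("t5", "t4", "+", "d"),
    ("x", "t5"),
]
optimized = eliminate_local_cse(block)
for instr in optimized:
    print(instr)
```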
     B3:  j = a + d
b)   B1:  b = a + d        B2:  b = a + d
          c = b                 h = b
     B3:  j = b
x=b+c
…
d=x+y
Constant folding
In constant folding, an expression whose operands are all constants is evaluated at compile time instead of
at execution time, and the computed value is used in its place.
For example, k = ( 22 / 7 ) * r * r
Here folding is done by performing the computation of ( 22 / 7 ) at compile time. Taking the division as a real division and rounding to two decimal places, it
can be optimized as,
k = 3.14 * r * r
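Constant folding can be sketched as a recursive walk over an expression tree. The tuple encoding (op, left, right) and the function name fold are assumptions made for this illustration; division is treated as real division, and division by zero is not guarded against.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def fold(expr):
    """expr is a number, a variable name, or a tuple (op, left, right)."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)     # both operands constant: fold now
    return (op, left, right)            # otherwise keep the operation


# k = ( 22 / 7 ) * r * r
expr = ("*", ("*", ("/", 22, 7), "r"), "r")
folded = fold(expr)
print(folded)
```

Only the subexpression ( 22 / 7 ) is replaced by its value; the multiplications by r are kept because r is not known at compile time.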
Loop optimizations
Loops, especially the inner loops, are the major source of code optimization.
Most of the run time is spent inside loops, so it can be reduced by reducing the number of
instructions in the inner loop.
The following are the loop optimization techniques.
1. Code motion.
2. Elimination of induction variables.
3. Strength reduction.
1. Code motion:-
Code motion is a technique that moves code out of a loop.
If an expression inside the loop gives the same result no matter how many times the loop
is executed (a loop-invariant computation), then it should be placed just before the
loop.
For example, while ( x != max - 3 )
{
x = x + 2;
}
Here the expression max - 3 is a loop-invariant computation. So this can be optimized
as follows:
k = max - 3;
while ( x != k )
{
x = x + 2;
}
3. Algebraic simplifications:-
Algebraic identities that occur frequently enough to be worth considering can
be simplified.
For example, X = X * 1 or X = 0 + X is often produced by straightforward
intermediate code generation algorithms. Such statements can be eliminated easily through
peephole optimization.
4. Strength reduction:-
Replace an expensive operation by a cheaper one.
For example, X^2 is an expensive operation. Hence replace it by X * X, which is
cheaper.
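Both peephole rules can be sketched as one pass over a list of statements. This is an illustration only: statements are encoded as (target, left, op, right) tuples with operands as strings, "^" stands for exponentiation, and the function name peephole is hypothetical.

```python
def peephole(code):
    """Apply algebraic simplification and strength reduction to each statement."""
    out = []
    for target, left, op, right in code:
        # Algebraic simplification: X = X * 1 and X = 0 + X do nothing.
        if op == "*" and target in (left, right) and "1" in (left, right):
            continue
        if op == "+" and target in (left, right) and "0" in (left, right):
            continue
        # Strength reduction: X ^ 2 becomes X * X.
        if op == "^" and right == "2":
            out.append((target, left, "*", left))
            continue
        out.append((target, left, op, right))
    return out


code = [
    ("X", "X", "*", "1"),
    ("X", "0", "+", "X"),
    ("Y", "X", "^", "2"),
    ("Z", "Y", "+", "X"),
]
optimized = peephole(code)
for stmt in optimized:
    print(stmt)
```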
3 Q: Explain about Data flow analysis
Data flow analysis:
– Flow-sensitive: sensitive to the control flow in a function
– Intraprocedural analysis
Examples of optimizations:
– Constant propagation
– Common sub expression elimination
– Dead code elimination
Data flow analysis abstraction:
– For each point in the program, combine the information from all the instances of the
same program point.
Reaching Definitions