CD Unit - 4


UNIT – 4

Runtime Environment: storage organization, runtime storage allocation,
activation records, procedure calls, displays.

Code optimization: The Principal Sources of Optimization, Basic Blocks,
Optimization of Basic Blocks, Structure-Preserving Transformations, Flow
Graphs, Loop Optimization, Data Flow Analysis, Peephole Optimization.

Storage Organization
• When the target program executes, it runs in its own logical address
space, in which each program value has a location.
• The management and organization of this logical address space is shared
among the compiler, the operating system and the target machine. The
operating system maps logical addresses into physical addresses, which are
usually spread throughout memory.

Subdivision of Run-time Memory:

• Run-time storage comes in blocks of contiguous bytes, where a byte is the
smallest unit of addressable memory. Four bytes can form a machine word. A
multibyte object is stored in consecutive bytes and is addressed by its
first byte.
• Run-time storage can be subdivided to hold the different components of an
executing program:
1. Generated executable code
2. Static data objects
3. Dynamic data objects (heap)
4. Automatic data objects (stack)

STORAGE ALLOCATION TECHNIQUES

I. Static Storage Allocation

Names are bound to storage at compile time only, so every time a procedure is
invoked its names are bound to the same storage locations. The values of local
names can therefore be retained across activations of a procedure. The compiler
can decide where the activation records go relative to the target code, and can
fill in the addresses of the data that the target code operates on.

• Storage created at compile time is placed in the static area.
• Storage created at compile time is created only once.
• Dynamic data structures are not supported: memory is laid out at compile
time and released only when the program completes.
• Recursion is not supported, because all activations of a procedure share
the same storage.
• The size of every data object must be known at compile time.

E.g., FORTRAN was designed to permit static storage allocation.
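The retention of local values across activations can be illustrated with a small sketch. It assumes C's static storage class as a stand-in for static allocation; the function names count_static and count_automatic are hypothetical and chosen only for this illustration.

```c
/* 'calls' lives in the static area: it is bound to the same storage
   location in every activation, so its value survives between calls. */
int count_static(void)
{
    static int calls = 0;   /* allocated once, not per activation */
    calls = calls + 1;
    return calls;
}

/* Here 'calls' is an automatic (stack) variable: each activation gets
   fresh storage, so nothing is retained between calls. */
int count_automatic(void)
{
    int calls = 0;
    calls = calls + 1;
    return calls;
}
```

Calling count_static() three times returns 1, 2 and 3, while count_automatic() returns 1 on every call.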

II. Stack Storage Allocation

• Storage is organized as a stack, and activation records are pushed and
popped as activations begin and end, respectively. Locals are contained in
activation records, so they are bound to fresh storage in each activation.
• Recursion is supported in stack allocation.

III. Heap Storage Allocation

• Memory can be allocated and deallocated at any time and in any order,
depending on the requirements of the program.
• Heap allocation is used to dynamically allocate memory to variables and
to reclaim it when the variables are no longer required.
• Recursion is supported.

PARAMETER PASSING: The communication medium among procedures is known as
parameter passing. The values of the variables from a calling procedure are
transferred to the called procedure by some mechanism.

Basic terminology:

• R-value: The value of an expression is called its r-value. The value
contained in a single variable also becomes an r-value if it appears on the
right side of the assignment operator. An r-value can always be assigned to
some other variable.
• L-value: The memory location (address) where an expression is stored is
known as its l-value. Only an expression that has an l-value can appear on
the left side of the assignment operator.
• Formal parameter: Variables that take the information passed by the
caller procedure are called formal parameters. These variables are declared
in the definition of the called function.
• Actual parameter: Variables or expressions whose values or addresses are
passed to the called function are called actual parameters. They are
specified in the function call as arguments.

Different ways of passing the parameters to the procedure:

• Call by Value: In call by value, the calling procedure passes the r-value
of the actual parameters, and the compiler puts that into the called
procedure's activation record. Formal parameters hold the values passed by
the calling procedure, so any changes made to the formal parameters do not
affect the actual parameters.
• Call by Reference: In call by reference, the formal and actual parameters
refer to the same memory location. The l-value of each actual parameter is
copied to the activation record of the called function, so the called
function has the address of the actual parameter. If an actual parameter
does not have an l-value (e.g., i+3), it is evaluated into a new temporary
location and the address of that location is passed. Any changes made to the
formal parameter are reflected in the actual parameter, because the changes
are made at its address.
• Call by Copy-Restore: In call by copy-restore, the compiler copies the
values of the actuals into the formal parameters when the procedure is
called, and copies them back into the actual parameters when control returns
to the calling procedure. The r-values are passed, and on return the
r-values of the formals are copied into the l-values of the actuals.
• Call by Name: In call by name, the actual parameters are substituted for
the formals in all the places the formals occur in the procedure. The
evaluation is lazy in the sense that an actual parameter is evaluated only
when the corresponding formal is used.
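The first three mechanisms can be sketched in C. C itself passes every argument by value, so call by reference and copy-restore are simulated here with pointers; the three function names are hypothetical and exist only for this illustration.

```c
/* Call by value: the callee receives a copy of the r-value. */
void inc_by_value(int x)
{
    x = x + 1;              /* changes only the callee's copy */
}

/* Call by reference, simulated by passing the l-value (address). */
void inc_by_reference(int *x)
{
    *x = *x + 1;            /* changes the caller's variable directly */
}

/* Copy-restore, simulated by hand: copy in, work locally, copy out. */
void inc_copy_restore(int *actual)
{
    int formal = *actual;   /* copy in on entry */
    formal = formal + 1;    /* the actual is untouched while we work */
    *actual = formal;       /* copy back on return */
}
```

Starting from a = 5, inc_by_value(a) leaves a at 5, inc_by_reference(&a) makes it 6, and inc_copy_restore(&a) makes it 7.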

Activation Records [important]

An activation record is a contiguous block of storage that holds the
information required by a single execution of a procedure; in other words, it
stores the status of the current activation. Whenever a procedure is called, a
new activation record is allocated and pushed onto the top of the stack. It
remains on the stack for the duration of that call. Once the procedure
completes and control returns to the calling function, the activation record
is popped off the stack and deallocated.

Activation Record includes some fields which are –

Return values, parameter list, control links, access links, saved machine status,
local data, and temporaries.

[Figure: Activation Record]

Temporaries:

The temporary values, such as those arising in the evaluation of expressions, are
stored in the field for temporaries.

Local data:

The field for local data holds data that is local to an execution of a procedure.

Saved Machine Status:

The field for saved machine status holds information about the state of the
machine just before the procedure is called. This information includes the
values of the program counter and the machine registers that have to be
restored when control returns from the procedure.

Access Link:

The access link refers to non-local data held in other activation records.
It is a static link: its main purpose is to reach data that is not present in
the local scope of the current activation record.

Let’s take an example to understand this –

#include <stdio.h>

int g = 12;                 /* global: not local to Geeks() */

void Geeks(void)
{
    printf("%d", g);        /* non-local reference to g */
}

int main(void)
{
    Geeks();
    return 0;
}

In this example, Geeks() is called from main() and its task is to print g, but
g is not defined within the local scope of Geeks(). Conceptually, Geeks() uses
its access link to reach g in the global scope and then prints its value (12).
(In practice, a C compiler addresses a global such as g directly at a fixed
static location; access links are needed only for nested procedures.)

The chain of access links traces the static (lexical) nesting structure of the
program.

Now, let’s take another example to understand the concept of access link in detail –

#include <stdio.h>

/* Nested functions are a GNU C extension, so this needs GCC. */
int main(int argc, char *argv[])
{
    int a = 100;

    int geeks(int b)
    {
        int c = a + b;      /* a is non-local to geeks() */
        return c;
    }

    int geek1(int b)
    {
        return geeks(2 * b);
    }

    (void) printf("The answer is %d\n", geek1(a));
    return 0;
}

The program compiles without errors (with GCC, which supports nested functions
as an extension) and displays the correct answer, 300. Now let's discuss the
nesting paths. The activation record (AR) of a nested procedure includes an
access link that gives access to the AR of the most recent activation of its
immediately enclosing procedure. So, in this example, the access links for
geeks and geek1 each point to the AR of the activation of main.

Each activation record gets a pointer, called the access link, that facilitates
a direct implementation of the normal static scope rule.

Control Link:

The control link refers to the activation record of the caller, and it is
dynamic in nature: when a function calls another function, the control link of
the callee points to the activation record of the caller. It is used to make
the caller's activation record current again when the callee returns.

Each record contains a control link pointing to the previous record on the
stack. The chain of control links traces the dynamic execution of the program.

Example –

#include <stdio.h>

void geeks(int x)
{
    printf("value of x is: %d", x);
}

int main(void)
{
    geeks(10);      /* the AR of geeks() gets a control link to main()'s AR */
    return 0;
}

Let’s take another example –

#include <stdio.h>

int geeks(void);

int main(void)
{
    int x, y;
    /* Calling a function */
    geeks();
    return 0;
}

int geeks(void)
{
    /* Function called from main() */
    printf("Function called from main()");
    return 0;
}

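When geeks() is called, its control link points to the activation record of
main(). Note that geeks() cannot refer to x and y by name: they are local to
main(), and under static scoping a function that is not nested inside main()
has no access to them.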

Parameter List:

The field for parameters list is used by the calling procedure to supply parameters
to the called procedure. We show space for parameters in the activation record, but
in practice, parameters are often passed in machine registers for greater efficiency.

Return value:

The field for the return value is used by the called procedure to return a value to
the calling procedure. Again, in practice, this value is often returned in a register
for greater efficiency.

Procedure calls

Procedures are an important and frequently used programming construct, so it
is imperative for a compiler to generate good code for procedure calls and
returns.

Calling sequence:

The translation for a call includes a sequence of actions taken on entry to
and exit from each procedure. The following actions take place in a calling
sequence:

• When a procedure call occurs, space is allocated for the activation
record of the called procedure.
• The arguments of the called procedure are evaluated.
• The environment pointers are established to enable the called procedure
to access data in enclosing blocks.
• The state of the calling procedure is saved so that it can resume
execution after the call.
• The return address is also saved. It is the address of the location to
which the called routine must transfer control after it is finished.
• Finally, a jump to the beginning of the code for the called procedure is
generated.

Let us consider a grammar for a simple procedure call statement:

1. S → call id(Elist)
2. Elist → Elist, E
3. Elist → E

A suitable translation scheme for procedure calls would be:

Production Rule          Semantic Action
S → call id(Elist)       for each item p on QUEUE do
                             GEN(param p)
                         GEN(call id.PLACE)
Elist → Elist, E         append E.PLACE to the end of QUEUE
Elist → E                initialize QUEUE to contain only E.PLACE

A queue is used to store the list of parameters in the procedure call.
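The three semantic actions can be sketched in C as follows. This is only an illustration under stated assumptions: the function names, the fixed-size queue, and the choice to write the generated code into a caller-supplied buffer are all hypothetical and not part of the scheme itself.

```c
#include <stdio.h>

#define MAX_ARGS 16

/* QUEUE of E.PLACE names, filled while the argument list is parsed. */
static const char *queue[MAX_ARGS];
static int queue_len = 0;

/* Elist -> E : initialize QUEUE to contain only E.PLACE */
void elist_first(const char *place)
{
    queue_len = 0;
    queue[queue_len++] = place;
}

/* Elist -> Elist , E : append E.PLACE to the end of QUEUE */
void elist_append(const char *place)
{
    if (queue_len < MAX_ARGS)
        queue[queue_len++] = place;
}

/* S -> call id ( Elist ) : emit one param per queued item, then the call.
   The generated code is written into 'out', which holds 'cap' bytes. */
void gen_call(const char *id_place, char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return;
    out[0] = '\0';
    for (int i = 0; i < queue_len && used < cap; i++) {
        int n = snprintf(out + used, cap - used, "param %s\n", queue[i]);
        if (n < 0)
            return;
        used += (size_t)n;
    }
    if (used < cap)
        snprintf(out + used, cap - used, "call %s\n", id_place);
}
```

For a call with argument places t1, t2 and t3 to a procedure f, calling elist_first("t1"), elist_append("t2"), elist_append("t3") and then gen_call("f", buf, sizeof buf) produces three param lines followed by "call f". Using a queue keeps the param statements in the same order as the arguments.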

Displays
An access link is a pointer in each activation record that provides a direct
implementation of lexical scope for nested procedures. In other words, access
links are used to implement lexically scoped languages: an access link is
followed to locate non-local data required by the called procedure.

An improved scheme for handling static links defined at various lexical levels
is a data structure called a display. A display is an array of pointers to
activation records. Display[0] contains a pointer to the activation record of
the most recent activation of a procedure defined at lexical level 0.

The number of elements in the display array is given by the maximum level of
nesting in the source program. In the display scheme for accessing non-local
variables defined in enclosing procedures, each procedure, on activation,
stores a pointer to its activation record in the display array at its lexical
level.

It saves the previous value at that location in the display array and restores
it when the procedure exits. The advantage of the display scheme is that the
activation record of any enclosing procedure at lexical level n can be fetched
directly using Display[n], as opposed to traversing the access links as in the
previous scheme.
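The save-and-restore discipline of the display can be sketched with a toy model in C. The struct layout, the MAX_LEVEL bound and the function names are assumptions made for this illustration; a real compiler would emit the equivalent steps in the procedure prologue and epilogue.

```c
#include <stddef.h>

#define MAX_LEVEL 8

/* A toy activation record: its lexical level and one local value. */
struct activation {
    int level;                  /* lexical nesting level of the procedure   */
    int local;                  /* stands in for the record's local data    */
    struct activation *saved;   /* previous display[level], restored on exit */
};

static struct activation *display[MAX_LEVEL];

/* On entry: save the old display entry at this level, then install ours. */
void enter_procedure(struct activation *ar)
{
    ar->saved = display[ar->level];
    display[ar->level] = ar;
}

/* On exit: restore the display entry that was saved on entry. */
void exit_procedure(struct activation *ar)
{
    display[ar->level] = ar->saved;
}

/* Non-local access: one array index, no chain of access links to follow. */
int read_local_at_level(int level)
{
    return display[level]->local;
}
```

If a level-1 procedure is entered while another level-1 activation is still live, display[1] points to the newer record until it exits, after which the older pointer is restored.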

There are two types of scope rules for non-local names:

Static Scope or Lexical Scope

Lexical scope is also known as static scope. In this type of scope, the
declaration that a name refers to is determined by examining the text of the
program alone. Pascal, C and Ada are examples of languages that use the static
scope rule. These languages are also known as block-structured languages.

Dynamic Scope

Dynamic scope rules are used for non-block-structured languages. In this type
of scoping, a non-local reference refers to the declaration in the most
recently called procedure that is still active. There are two methods to
implement non-local access under dynamic scoping:

Deep Access − The basic concept is to keep a stack of active variables. Control
links are used instead of access links: to find a variable, the stack is
searched from top to bottom for the most recent activation record that
contains space for the desired variable. Because the search may go "deep" into
the stack, the method is called deep access. In this method, a symbol table is
needed at run time.

Shallow Access − The idea is to keep central storage and allot one slot for
every variable name. If names are not created at run time, the storage layout
can be fixed at compile time.

Principal Sources of Optimization
A transformation of a program is called local if it can be performed by looking
only at the statements in a basic block; otherwise, it is called global. Many
transformations can be performed at both the local and global levels. Local
transformations are usually performed first.

Function-Preserving Transformations

There are a number of ways in which a compiler can improve a program without
changing the function it computes.

Examples of function-preserving transformations:

• Common subexpression elimination
• Copy propagation
• Dead-code elimination
• Constant folding

The other transformations come up primarily when global optimizations are
performed.

Frequently, a program will include several calculations of the offset in an
array. Some of the duplicate calculations cannot be avoided by the programmer
because they lie below the level of detail accessible within the source
language.

Common Subexpression Elimination:

• An occurrence of an expression E is called a common subexpression if E was
previously computed and the values of the variables in E have not changed since
the previous computation. We can avoid recomputing the expression if we can use
the previously computed value.

• For example:

t1 := 4*i
t2 := a[t1]
t3 := 4*j
t4 := 4*i
t5 := n
t6 := b[t4]+t5

The above code can be optimized using common subexpression elimination as:

t1 := 4*i
t2 := a[t1]
t3 := 4*j
t5 := n
t6 := b[t1]+t5

The common subexpression t4 := 4*i is eliminated because its value is already
available in t1 and the value of i has not changed between the definition and
the use.

Copy Propagation:

Assignments of the form f := g are called copy statements, or copies for short.
The idea behind the copy-propagation transformation is to use g for f wherever
possible after the copy statement f := g. Copy propagation means using one
variable instead of another. This may not appear to be an improvement, but it
gives us an opportunity to eliminate the copied variable.

• For example:

x = Pi;
A = x*r*r;

After copy propagation, the second statement becomes:

A = Pi*r*r;

Here the variable x is no longer used, so the copy statement x = Pi can be
eliminated.

Dead-Code Elimination:

A variable is live at a point in a program if its value can be used
subsequently; otherwise, it is dead at that point. A related idea is dead or
useless code: statements that compute values that never get used. While the
programmer is unlikely to introduce any dead code intentionally, it may appear
as the result of previous transformations.

Example:

i = 0;
if (i == 1)
    a = b + 5;

Here, the 'if' statement is dead code because its condition can never be
satisfied.

Constant Folding:

Deducing at compile time that the value of an expression is a constant and
using the constant instead is known as constant folding. One advantage of copy
propagation is that it often turns the copy statement into dead code.

For example,

a = 3.14157/2 can be replaced by
a = 1.570785, thereby eliminating a division operation.

Loop Optimizations:

Programs tend to spend the bulk of their time in loops, especially inner loops.
The running time of a program may be improved if the number of instructions in
an inner loop is decreased, even if we increase the amount of code outside that
loop.

Three techniques are important for loop optimization:

• Code motion, which moves code outside a loop;
• Induction-variable elimination, which removes redundant induction variables
from an inner loop;
• Reduction in strength, which replaces an expensive operation by a cheaper
one, such as a multiplication by an addition.

Fig. 5.2 Flow graph

Code Motion:

An important modification that decreases the amount of code in a loop is code
motion. This transformation takes an expression that yields the same result
independent of the number of times a loop is executed (a loop-invariant
computation) and places the expression before the loop. Note that the notion
"before the loop" assumes the existence of an entry for the loop. For example,
evaluation of limit-2 is a loop-invariant computation in the following
while-statement:

while (i <= limit-2) /* statement does not change limit */

Code motion will result in the equivalent of

t = limit-2;
while (i <= t) /* statement does not change limit or t */

Induction Variables:

Loops are usually processed inside out. For example, consider the loop around
B3. Note that the values of j and t4 remain in lockstep: every time the value
of j decreases by 1, that of t4 decreases by 4, because 4*j is assigned to t4.
Such identifiers are called induction variables.

When there are two or more induction variables in a loop, it may be possible to
get rid of all but one by the process of induction-variable elimination. For
the inner loop around B3 in Fig. 5.3 we cannot get rid of either j or t4
completely; t4 is used in B3 and j in B4.

Basic Blocks:
A basic block is a straight-line code sequence with no branches in except at
the entry and no branches out except at the end. It is a sequence of
statements that always execute one after the other.

The first task is to partition a sequence of three-address instructions into
basic blocks. A new basic block begins with the first instruction, and
instructions are added until a jump or a label is met. In the absence of a
jump, control moves consecutively from one instruction to the next. The idea
is formalized in the algorithm below:

Algorithm: Partitioning three-address code into basic blocks.

Input: A sequence of three-address instructions.

Process: The instructions of the intermediate code that are leaders are
determined. The following rules are used for finding leaders:

1. The first three-address instruction of the intermediate code is a leader.
2. Instructions that are targets of unconditional or conditional jump/goto
statements are leaders.
3. Instructions that immediately follow unconditional or conditional jump/goto
statements are leaders.

For each leader, its basic block consists of the leader itself and all
instructions up to, but not including, the next leader.

Example 1:

The following sequence of three-address statements forms a basic block:

t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5

A three address statement x:= y+z is said to define x and to use y and z. A name in
a basic block is said to be live at a given point if its value is used after that point in
the program, perhaps in another basic block.

Example 2:
Intermediate code to set a 10*10 matrix to an identity matrix:

1) i = 1                  //Leader 1 (First statement)
2) j = 1                  //Leader 2 (Target of 11th statement)
3) t1 = 10 * i            //Leader 3 (Target of 9th statement)
4) t2 = t1 + j
5) t3 = 8 * t2
6) t4 = t3 - 88
7) a[t4] = 0.0
8) j = j + 1
9) if j <= 10 goto (3)
10) i = i + 1             //Leader 4 (Immediately following conditional goto)
11) if i <= 10 goto (2)
12) i = 1                 //Leader 5 (Immediately following conditional goto)
13) t5 = i - 1            //Leader 6 (Target of 17th statement)
14) t6 = 88 * t5
15) a[t6] = 1.0
16) i = i + 1
17) if i <= 10 goto (13)

The above code sets a matrix to the identity matrix, i.e. a matrix with all
diagonal elements 1 and all other elements 0.

Steps (3)-(7) set an element to 0, and steps (13)-(15) set a diagonal element
to 1. These steps are executed repeatedly by means of the goto statements.

There are 6 basic blocks in the above code:

B1) Statement 1
B2) Statement 2
B3) Statements 3-9
B4) Statements 10-11
B5) Statement 12
B6) Statements 13-17
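
The three leader rules can be sketched in C as follows. The encoding is an assumption made for this illustration: each instruction is reduced to its jump target (0 if it is not a jump), instructions are numbered from 1, and the function name find_leaders is hypothetical.

```c
#include <stdbool.h>

/* jump_target[i] is the 1-based target of instruction i if it is a
   conditional or unconditional jump, and 0 otherwise. Index 0 is unused,
   so both arrays need n + 1 elements. Targets are assumed to lie in 1..n.
   On return, is_leader[i] is true exactly for the leaders. */
void find_leaders(const int jump_target[], int n, bool is_leader[])
{
    for (int i = 0; i <= n; i++)
        is_leader[i] = false;
    if (n >= 1)
        is_leader[1] = true;                    /* rule 1: first instruction */
    for (int i = 1; i <= n; i++) {
        if (jump_target[i] != 0) {
            is_leader[jump_target[i]] = true;   /* rule 2: target of a jump  */
            if (i + 1 <= n)
                is_leader[i + 1] = true;        /* rule 3: follows a jump    */
        }
    }
}
```

For Example 2, the jumps are at statements 9, 11 and 17 (to 3, 2 and 13). With that input the function marks statements 1, 2, 3, 10, 12 and 13 as leaders, which matches the six basic blocks listed above.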

Optimization of basic blocks

Optimization is applied to the basic blocks after the intermediate code
generation phase of the compiler. Optimization is the process of transforming a
program so that it consumes fewer resources and runs faster. In optimization,
high-level code constructs are replaced by equivalent, more efficient low-level
code. Optimization of basic blocks can be machine-dependent or
machine-independent. These transformations improve the quality of the code that
will ultimately be generated from a basic block.

There are two types of basic block optimizations:

1. Structure-preserving transformations
2. Algebraic transformations

Structure-Preserving Transformations:
The structure-preserving transformations on basic blocks include:

1. Dead Code Elimination
2. Common Subexpression Elimination
3. Renaming of Temporary Variables
4. Interchange of Two Independent Adjacent Statements

1. Dead Code Elimination:

Dead code is the part of the code that never executes during program
execution, or whose result is never used. Although it never runs, it still
occupies space in the program and must be translated by the compiler, so it is
eliminated. Eliminating dead code reduces code size and spares the compiler
from translating it.

Example:

// Program with dead code
int main()
{
    int x = 2;
    if (x > 2)
        cout << "code"; // Dead code
    else
        cout << "Optimization";
    return 0;
}

// Optimized program without dead code
int main()
{
    int x = 2;
    cout << "Optimization"; // Dead code eliminated
    return 0;
}

2. Common Subexpression Elimination:

In this technique, subexpressions that occur more than once are calculated only
once and the result is reused when needed. A DAG (Directed Acyclic Graph) is
used to detect common subexpressions.

3. Renaming of Temporary Variables:

A statement that defines a temporary variable can be changed to define a new
temporary variable instead, with all uses of the old temporary renamed
accordingly, without changing the value of the basic block.

Example: The statement t = a + b can be changed to x = a + b, where t is a
temporary variable and x is a new temporary variable, without changing the
value of the basic block.

4. Interchange of Two Independent Adjacent Statements:

Two adjacent statements in a block that are independent of each other can be
interchanged without affecting the value of the basic block.

Example:

t1 = a + b
t2 = c + d

These two independent statements of a block can be interchanged without
affecting the value of the block.

Algebraic Transformations:

Countless algebraic transformations can be used to change the set of
expressions computed by a basic block into an algebraically equivalent set.
Some of the algebraic transformations on basic blocks are:

1. Constant Folding
2. Copy Propagation
3. Strength Reduction

1. Constant Folding:

Evaluate constant subexpressions at compile time so that they do not have to be
computed at run time.

Example:

x = 2 * 3 + y ⇒ x = 6 + y (Optimized code)
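
One way a compiler might perform this folding is a bottom-up walk over an expression tree. The sketch below uses a hypothetical mini-IR (the enum, the struct and the function fold are assumptions for illustration only) and ignores integer overflow.

```c
/* Hypothetical mini-IR: a node is a constant, a variable, or a binary op. */
enum kind { CONST, VAR, ADD, MUL };

struct expr {
    enum kind kind;
    int value;                  /* used when kind == CONST       */
    struct expr *left, *right;  /* used when kind is ADD or MUL  */
};

/* Fold bottom-up: if both operands of an operator are constants after
   folding, overwrite the operator node with the computed constant. */
void fold(struct expr *e)
{
    if (e->kind != ADD && e->kind != MUL)
        return;                 /* constants and variables are leaves */
    fold(e->left);
    fold(e->right);
    if (e->left->kind == CONST && e->right->kind == CONST) {
        int l = e->left->value;
        int r = e->right->value;
        e->value = (e->kind == ADD) ? l + r : l * r;
        e->kind = CONST;
    }
}
```

For the tree of 2 * 3 + y, the multiplication node is replaced by the constant 6, while the addition node stays an addition because y is not a constant.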

2. Copy Propagation:

It is of two types, variable propagation and constant propagation.

Variable Propagation:

x = y
z = x + 2    ⇒    z = y + 2 (Optimized code)

Constant Propagation:

x = 3
z = x + a    ⇒    z = 3 + a (Optimized code)

3. Strength Reduction:

Replace expensive statements/instructions with cheaper ones.

x = 2 * y (costly) ⇒ x = y + y (cheaper)
x = 2 * y (costly) ⇒ x = y << 1 (cheaper)

Loop Optimization:

Loop optimization includes the following strategies:

1. Code motion & frequency reduction
2. Induction variable elimination
3. Loop merging/combining
4. Loop unrolling

1. Code Motion & Frequency Reduction

Move loop-invariant code outside of the loop.

// Program with loop-invariant code inside the loop
int main()
{
    for (i = 0; i < n; i++) {
        x = 10;
        y = y + i;
    }
    return 0;
}

// Program with loop-invariant code moved outside the loop
int main()
{
    x = 10;
    for (i = 0; i < n; i++)
        y = y + i;
    return 0;
}

2. Induction Variable Elimination:

Eliminate unnecessary induction variables used in the loop.

// Program with multiple induction variables
int main()
{
    i1 = 0;
    i2 = 0;
    for (i = 0; i < n; i++) {
        A[i1++] = B[i2++];
    }
    return 0;
}

// Program with one induction variable
int main()
{
    for (i = 0; i < n; i++) {
        A[i] = B[i]; // Only one induction variable
    }
    return 0;
}

3. Loop Merging/Combining:

If the operations performed can be done in a single loop, then merge or combine
the loops.

// Program with multiple loops
int main()
{
    for (i = 0; i < n; i++)
        A[i] = i + 1;
    for (j = 0; j < n; j++)
        B[j] = j - 1;
    return 0;
}

// Program with one loop after the loops are merged
int main()
{
    for (i = 0; i < n; i++) {
        A[i] = i + 1;
        B[i] = i - 1;
    }
    return 0;
}

4. Loop Unrolling:

Loop unrolling replicates the loop body so that each iteration does the work of
several original iterations. This reduces the number of times the loop executes
and hence the loop-control overhead (increment, test and branch).
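
As an illustration, the following sketch unrolls a summation loop by hand. The unroll factor of 4 and the function names are arbitrary choices for this example; a compiler would pick the factor based on the target machine.

```c
/* Original loop: one loop test per element. */
long sum_rolled(const int a[], int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by a factor of 4: one loop test per four elements.
   A cleanup loop handles the leftover n % 4 elements. */
long sum_unrolled(const int a[], int n)
{
    long s = 0;
    int i = 0;
    for (; i + 3 < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)          /* remaining 0 to 3 elements */
        s += a[i];
    return s;
}
```

Both functions return the same sum for any n >= 0; the unrolled version executes roughly a quarter of the loop tests at the cost of a larger loop body.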

Flow graphs
A flow graph is a directed graph that contains the flow-of-control information
for a set of basic blocks.

A control flow graph depicts how program control is passed among the blocks. It
is useful in loop optimization.

The flow graph for the vector dot product is given as follows:

• Block B1 is the initial node. Block B2 immediately follows B1, so there is
an edge from B1 to B2.
• The target of the jump from the last statement of B1 is the first statement
of B2, so there is an edge from B1 to B2.
• B2 is a successor of B1, and B1 is a predecessor of B2.

Loop optimization:
Loop optimization is the most valuable machine-independent optimization,
because programs spend the bulk of their time in inner loops.

If we decrease the number of instructions in an inner loop, the running time of
a program may be improved even if we increase the amount of code outside that
loop.

For loop optimization the following three techniques are important:

1. Code motion
2. Induction-variable elimination
3. Strength reduction

1. Code Motion:

Code motion is used to decrease the amount of code in a loop. This
transformation takes a statement or expression that can be moved outside the
loop body without affecting the semantics of the program.

For example, in the following while statement, limit-2 is a loop-invariant
expression:

while (i <= limit-2) /* statement does not change limit */

After code motion the result is as follows:

a = limit-2;
while (i <= a) /* statement does not change limit or a */

2. Induction-Variable Elimination

Induction-variable elimination removes redundant induction variables from an
inner loop. It can reduce the number of additions in a loop, and it improves
both code space and run-time performance.

In this figure, we can replace the assignment t4 := 4*j by t4 := t4-4. The only
problem that arises is that t4 does not have a value when we enter block B2 for
the first time, so we place the assignment t4 := 4*j on entry to block B2.

3. Reduction in Strength

• Strength reduction replaces an expensive operation by a cheaper one on the
target machine.
• Addition of a constant is cheaper than a multiplication, so we can replace a
multiplication with an addition within the loop.
• Multiplication is cheaper than exponentiation, so we can replace
exponentiation with multiplication within the loop.

Example:

while (i < 10)
{
    j = 3*i + 1;
    a[j] = a[j] - 2;
    i = i + 2;
}

After strength reduction the code will be:

s = 3*i + 1;
while (i < 10)
{
    j = s;
    a[j] = a[j] - 2;
    i = i + 2;
    s = s + 6;
}

In the above code, it is cheaper to compute s = s+6 than j = 3*i + 1, because
the multiplication is replaced by an addition.
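
The equivalence of the two loops can be checked with a small C sketch. The wrapper functions update_multiply and update_add are hypothetical names introduced here; the array is assumed to have at least 29 elements, since j can reach 3*9+1 = 28 when i starts at a non-negative value.

```c
/* Before: a multiplication on every iteration. */
void update_multiply(int a[], int i)
{
    while (i < 10) {
        int j = 3 * i + 1;
        a[j] = a[j] - 2;
        i = i + 2;
    }
}

/* After strength reduction: s tracks 3*i+1 and is advanced by 6,
   because i grows by 2 and 3*(i+2)+1 = (3*i+1) + 6. */
void update_add(int a[], int i)
{
    int s = 3 * i + 1;
    while (i < 10) {
        int j = s;
        a[j] = a[j] - 2;
        i = i + 2;
        s = s + 6;
    }
}
```

Starting from i = 0, both versions decrement the elements at indices 1, 7, 13, 19 and 25 and leave every other element unchanged.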

Data flow analysis

Global data flow analysis

• To optimize the code efficiently, the compiler collects information about
the whole program and distributes it to each block of the flow graph. This
process is known as data-flow analysis.
• Certain optimizations can only be achieved by examining the entire program;
they cannot be achieved by examining just a portion of it.
• For this kind of optimization, use-definition chaining is one particular
problem.
• Here, for each use of a variable, we try to find out which definitions of
that variable can reach the statement.

Based on local information alone, a compiler can perform some optimizations.
For example, consider the following code:

x = a + b;
x = 6 * 3;

• In this code, the first assignment to x is useless: the value computed for x
is never used in the program.
• At compile time the expression 6*3 will be computed, simplifying the second
assignment statement to x = 18;

Some optimizations need more global information. For example, consider the
following code:

1. a = 1;
2. b = 2;
3. c = 3;
4. if (....) x = a + 5;
5. else x = b + 4;
6. c = x + 1;

In this code, the initial assignment at line 3 is useless, and the expression
x + 1 can be simplified to 7.

But it is less obvious how a compiler can discover these facts by looking only
at one or two consecutive statements. A more global analysis is required so
that the compiler knows the following things at each point in the program:

• Which variables are guaranteed to have constant values
• Which variables will be used before being redefined

Data flow analysis is used to discover this kind of property. It is performed
on the program's control flow graph (CFG).

The control flow graph of a program is used to determine those parts of a
program to which a particular value assigned to a variable might propagate.


Peephole optimization
Peephole optimization is a technique performed on compiler-generated
instructions.

Before we dive into the discussion of the peephole, let's recall how a program
is executed. Source code is the code written by the user, generally in a
high-level language such as Python or C++. Before it runs, the source code is
translated by a compiler into a lower-level form, such as bytecode or machine
code, that an interpreter or the machine can execute. The translated code can
be improved by various program transformation techniques so that the program
consumes fewer resources and runs faster.

Peephole optimization is an optimization technique by which the generated code
is improved so that it performs better on the target machine. More formally:

What is Peephole Optimization in Compiler Design?

Peephole optimization is an optimization technique performed on a small set of
compiler-generated instructions; the small set is known as the peephole or
window.

Some important aspects regarding peephole optimization:

1. It is applied to the source code after it has been converted to the target code.
2. Peephole optimization comes under machine-dependent optimization.
   Machine-dependent optimization occurs after the target code has been
   generated and transformed to fit the target machine architecture. It makes
   use of CPU registers and may make use of absolute memory references
   rather than relative memory references.
3. It is applied to a small piece of code, repeatedly.

The objectives of peephole optimization are as follows:

1. Improve performance
2. Reduce memory footprint
3. Reduce code size

Objectives of Peephole Optimization in Compiler Design

The objectives of Peephole optimization in compiler design are:

 Make the generated machine code smaller, improving cache usage and saving memory.
 Improve the arrangement of instructions so that the program runs faster.
 Remove operations that are unnecessary or repeated.
 Compute constant values in advance and replace variables with their known values.
 Improve branches and jumps so that control flow is more efficient.
 Replace slower instructions with quicker options for better performance.
 Optimize memory operations for better data handling.
 Use specific hardware features for better performance on the target platform.

Working of Peephole Optimization in Compiler design

There are mainly four steps in Peephole Optimization in Compiler Design. The
steps are as follows:

Identification

 The first step is to identify the section of code to which peephole optimization
will be applied.

 The peephole is a small window of consecutive instructions; its size depends on
the specific optimization being performed.

 The compiler determines which instructions fall within the window.

Optimization

 In the next step, the predefined optimization rules are applied to the peephole.

 The compiler searches for specific patterns of instructions in the window.

 There can be many types of patterns, such as inefficient code, a series of
loads and stores, or more complex patterns like branches.

Analysis

 After a pattern is identified, the compiler makes the changes to the
instructions.

 The compiler then checks the code to determine whether the changes improved it.

 The improvement is judged on size, speed and memory usage.

Iteration

 The above steps are repeated, finding new peepholes each time, until no more
optimisation is left in the code.

 The compiler visits the instructions one at a time, makes the changes and
re-analyses the result.
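The four steps above can be sketched as a small driver loop. This is only an illustration under stated assumptions: the tuple encoding of instructions, the two sample rules, and the names peephole, RULES and WINDOW_SIZE are invented for the sketch and do not come from any real compiler.

```python
# A minimal peephole driver (illustrative sketch, not a production optimizer).
# Instructions are tuples such as ("MOV", "R0", "a") or ("ADD", "#0", "R0").

def drop_add_zero(window):
    """Rule: adding the constant 0 has no effect, so remove the instruction."""
    if window[0][0] == "ADD" and window[0][1] == "#0":
        return window[1:]
    return None


def drop_jump_to_next(window):
    """Rule: a jump to the label that immediately follows it is a null sequence."""
    if len(window) == 2 and window[0][0] == "JMP" and window[1] == ("LABEL", window[0][1]):
        return window[1:]
    return None


RULES = [drop_add_zero, drop_jump_to_next]
WINDOW_SIZE = 2


def peephole(code):
    code = list(code)
    changed = True
    while changed:  # Iteration: repeat until no rule applies anywhere
        changed = False
        i = 0
        while i < len(code):
            window = code[i:i + WINDOW_SIZE]  # Identification: slide the window
            for rule in RULES:  # Optimization: look for a known pattern
                replacement = rule(window)
                # Analysis: accept the rewrite only if it makes the code shorter
                if replacement is not None and len(replacement) < len(window):
                    code[i:i + len(window)] = replacement
                    changed = True
                    break  # re-examine the same position with the new code
            else:
                i += 1  # no rule matched here, move the window forward
    return code


program = [
    ("MOV", "b", "R0"),
    ("ADD", "#0", "R0"),
    ("JMP", "L1"),
    ("LABEL", "L1"),
    ("MOV", "R0", "a"),
]
optimized = peephole(program)
```

Removing the ADD brings the JMP and its label into the same window, which is why the loop re-examines a position after every change and repeats whole passes until nothing changes.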

Peephole Optimization Techniques

There are various peephole optimization techniques.

Redundant Load and Store

In this optimization, redundant load and store operations on registers are removed.

For example,

a = b + c

d = a + e

It is implemented using the register R0 as

MOV b, R0    ; copy b to the register
ADD c, R0    ; add c to the register; the register now holds b+c
MOV R0, a    ; copy the register (b+c) to a
MOV a, R0    ; copy a to the register
ADD e, R0    ; add e to the register; the register now holds (b+c)+e
MOV R0, d    ; copy the register to d

This can be optimized by removing the redundant load. The third instruction copies
the value in register R0 to a, and the fourth instruction loads a back into R0 for
the next operation. Since R0 already holds the value of a, the fourth instruction
can be removed. The optimized implementation will be:

MOV b, R0    ; copy b to the register
ADD c, R0    ; add c to the register, which now holds b+c (a)
MOV R0, a    ; copy the register to a
ADD e, R0    ; add e to the register, which now holds b+c+e [(a)+e]
MOV R0, d    ; copy the register to d
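A minimal sketch of this rule is shown below, assuming instructions are encoded as (opcode, source, destination) tuples in the same operand order as the listing above. The function name remove_redundant_loads is hypothetical. The sketch also assumes that the removed instruction is not the target of a jump; if it carried a label, control could reach it without the preceding store having run, and the removal would not be safe.

```python
# Redundant load elimination (illustrative sketch).
# An instruction is (opcode, source, destination), matching "MOV source, destination".

def remove_redundant_loads(code):
    result = []
    for instr in code:
        if result:
            prev = result[-1]
            # "MOV R0, a" followed by "MOV a, R0": the second move copies the
            # value back to where it came from, so it can be dropped.
            if (prev[0] == "MOV" and instr[0] == "MOV"
                    and instr[1] == prev[2] and instr[2] == prev[1]):
                continue
        result.append(instr)
    return result


original = [
    ("MOV", "b", "R0"),
    ("ADD", "c", "R0"),
    ("MOV", "R0", "a"),
    ("MOV", "a", "R0"),
    ("ADD", "e", "R0"),
    ("MOV", "R0", "d"),
]
optimized = remove_redundant_loads(original)
```

Applied to the six instructions of the example, the sketch drops the fourth one and leaves the five-instruction sequence.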

Strength Reduction

In strength reduction optimization, operators that consume higher execution time
are replaced by operators that consume less execution time. For example,
multiplication and division by a power of two can be replaced by shift operators.

Initial code:
b = a * 2;
Optimized code:
b = a << 1;
// left shifting by one bit doubles the value
Initial code:
b = a / 2;
Optimized code:
b = a >> 1;
// right shifting by one bit gives the same result for non-negative integers
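The rewrite can be sketched as a small function over three-address statements. The (target, left, op, right) tuple encoding and the name reduce_strength are assumptions made for this sketch. Note that the division case is only valid when the operand is known to be non-negative: in C, integer division truncates toward zero, while an arithmetic right shift of a negative value rounds toward negative infinity.

```python
def reduce_strength(target, left, op, right):
    """Rewrite multiplication or division by a power of two as a shift.

    Returns a (target, left, op, right) tuple. The statement is returned
    unchanged when the rule does not apply.
    """
    is_power_of_two = (
        isinstance(right, int) and right > 1 and (right & (right - 1)) == 0
    )
    if is_power_of_two and op in ("*", "/"):
        shift = right.bit_length() - 1  # log2 of the constant
        return (target, left, "<<" if op == "*" else ">>", shift)
    return (target, left, op, right)


doubled = reduce_strength("b", "a", "*", 2)  # ("b", "a", "<<", 1)
halved = reduce_strength("b", "a", "/", 2)   # ("b", "a", ">>", 1)
```

A multiplier such as 3 is left alone, because it is not a power of two and a single shift cannot replace it.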

Simplify Algebraic Expressions

Algebraic expressions that are useless or written inefficiently are transformed.

For example:

a = a + 0
a = a * 1
a = a / 1
a = a - 0
// All of the above expressions cause calculation overhead without changing a.
// They can be removed for optimization.
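One way to sketch this rule is shown below, again using a hypothetical (target, left, op, right) tuple encoding for statements. A statement that assigns a variable to itself through an identity operation is dropped; an identity operation with a different target is reduced to a plain copy.

```python
# Right-hand operand that leaves the left operand unchanged, per operator.
IDENTITY = {"+": 0, "-": 0, "*": 1, "/": 1}


def simplify(statements):
    result = []
    for target, left, op, right in statements:
        if isinstance(right, int) and op in IDENTITY and IDENTITY[op] == right:
            if target == left:
                continue  # a = a + 0 does nothing, so drop it
            result.append((target, left, None, None))  # b = a * 1 becomes the copy b = a
        else:
            result.append((target, left, op, right))
    return result


simplified = simplify([
    ("a", "a", "+", 0),
    ("a", "a", "*", 1),
    ("b", "a", "/", 1),
    ("c", "a", "-", 1),
])
```

Only the last two statements survive: b = a / 1 is kept as the copy b = a, and c = a - 1 is untouched because subtracting 1 is not an identity.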

Replace Slower Instructions With Faster

Slower instructions can be replaced with faster ones when the target machine
provides them. For example, a machine with dedicated increment and decrement
instructions can usually perform them faster than adding or subtracting the
constant one. The same can be done with many other operations, like multiplication.

ADD #1, R
SUB #1, R
// The above instructions can be replaced with
// INC R
// DEC R
// if the machine supports increment and decrement instructions

Let's see another example, using Java bytecode:

Here X is loaded onto the stack twice and then multiplied. We can use the dup
instruction instead: it copies the value on the top of the stack (X need not be
loaded again), and then we can perform the multiplication. It works faster and can
be preferred over the slower sequence.
iload X
iload X
imul
// The above instructions (X is assumed to be an int) can be replaced with
iload X
dup
imul
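Both replacements in this section can be sketched as simple pattern rewrites. The tuple encodings, the generic LOAD, DUP and MUL opcode names, and the function names use_inc_dec and use_dup are assumptions for this sketch; whether INC and DEC are actually faster depends on the target machine.

```python
def use_inc_dec(code):
    """Register machine: replace ADD #1, R with INC R and SUB #1, R with DEC R."""
    cheaper = {"ADD": "INC", "SUB": "DEC"}
    result = []
    for instr in code:
        if instr[0] in cheaper and len(instr) == 3 and instr[1] == "#1":
            result.append((cheaper[instr[0]], instr[2]))
        else:
            result.append(instr)
    return result


def use_dup(code):
    """Stack machine: replace the second of two identical loads with DUP."""
    result = []
    for instr in code:
        if result and instr[0] == "LOAD" and result[-1] == instr:
            result.append(("DUP",))  # the value is already on top of the stack
        else:
            result.append(instr)
    return result


register_code = use_inc_dec([("ADD", "#1", "R"), ("SUB", "#1", "R"), ("ADD", "#2", "R")])
stack_code = use_dup([("LOAD", "X"), ("LOAD", "X"), ("MUL",)])
```

The first call leaves ADD #2, R unchanged because only a constant of one matches the rule; the second call turns the repeated load of X into a DUP.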

Dead code Elimination

Dead code, that is, code that can never be executed or whose result is never used,
can be eliminated to improve the program's performance; the program becomes smaller
and less memory is needed.

int dead(void)
{
    int a = 1;
    int b = 5;
    int c = a + b;
    return c;
    // c is returned here
    // The remaining code is dead code: it is never reachable
    int k = 1;
    k = k * 2;
    k = k + b;
    return k;
    // This dead code can be removed for optimization
}

Moreover, null sequences and useless operations can be deleted too.
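The unreachable-code case can be sketched on a flat instruction list, assuming a hypothetical tuple encoding in which RET and JMP are unconditional transfers and LABEL marks a possible jump target. The sketch is deliberately conservative: it treats every label as reachable, so it only removes code that sits between an unconditional transfer and the next label.

```python
def remove_unreachable(code):
    """Drop instructions that follow an unconditional RET or JMP, up to the next label."""
    result = []
    reachable = True
    for instr in code:
        if instr[0] == "LABEL":
            reachable = True  # a label may be a jump target, so execution can resume here
        if reachable:
            result.append(instr)
        if instr[0] in ("RET", "JMP"):
            reachable = False  # nothing falls through an unconditional transfer
    return result


# A hand-written lowering of the dead() function above.
lowered = [
    ("MOV", "#1", "a"),
    ("MOV", "#5", "b"),
    ("ADD", "a", "b", "c"),
    ("RET", "c"),
    ("MOV", "#1", "k"),
    ("MUL", "#2", "k"),
    ("ADD", "k", "b", "k"),
    ("RET", "k"),
]
cleaned = remove_unreachable(lowered)
```

Everything after the first RET is dropped, which matches the part of dead() marked as never reachable.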
