Adsa-4 Unit


MC4101

UNIT IV ALGORITHM DESIGN TECHNIQUES

Dynamic Programming:

Elements of Dynamic Programming



Greedy Algorithms

• An algorithm is designed to achieve an optimal solution for a given problem.
• In the greedy approach, decisions are made from the given solution domain.
• Being greedy, the choice that seems to provide an optimal solution at the moment is chosen.
• Greedy algorithms try to find a localized optimum, which may eventually lead to a globally optimized solution.
• However, greedy algorithms generally do not guarantee globally optimized solutions.
• Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit.
• Problems where choosing the locally optimal option also leads to a global solution are the best fit for greedy.

Advantages of Greedy Approach/Technique

• This technique is easy to formulate and implement.
• It works efficiently in many scenarios.
• This approach minimizes the time required for generating the solution.

Disadvantages of Greedy Approach/Technique

• This approach does not guarantee a globally optimal solution, since it never looks back at the choices made while finding the local optimum.

Greedy Algorithm

1. To begin with, the solution set (containing answers) is empty.
2. At each step, an item is added to the solution set.
3. If the solution set remains feasible, the current item is kept.
4. Else, the item is rejected and never considered again.

Examples

Most networking algorithms use the greedy approach. Here is a list of a few of them −

• Travelling Salesman Problem
• Prim's Minimal Spanning Tree Algorithm
• Kruskal's Minimal Spanning Tree Algorithm
• Dijkstra's Shortest Path Algorithm
• Graph - Map Coloring
• Graph - Vertex Cover
• Knapsack Problem
• Job Scheduling Problem

There are many similar problems that use the greedy approach to find an optimum solution.

Algorithm of Greedy method

Algorithm Greedy(a, n)
{
    solution = ∅;
    for i = 1 to n do
    {
        x = Select(a);
        if Feasible(solution, x) then
            solution = solution + x;
    }
    return solution;
}
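As a concrete, runnable instance of this schema, the Python sketch below makes change greedily; the coin denominations and the amount are illustrative assumptions, not from the source. Here Select() corresponds to picking the largest remaining coin and the feasibility check is whether the coin still fits into the remaining amount:

```python
def greedy_change(amount, coins=(25, 10, 5, 1)):
    """Greedy schema: repeatedly Select() the best candidate and
    keep it while the partial solution stays feasible."""
    solution = []
    for coin in sorted(coins, reverse=True):   # Select: largest coin first
        while amount >= coin:                  # feasible: coin still fits
            solution.append(coin)              # solution = solution + x
            amount -= coin
    return solution

print(greedy_change(63))  # [25, 25, 10, 1, 1, 1]
```

Note that this greedy choice is optimal for canonical coin systems like the one above, but not for arbitrary ones (with coins (4, 3, 1) and amount 6, greedy returns 4+1+1 while the optimum is 3+3) — exactly the caveat stated in the bullets above.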

Elements of the Greedy Strategy

The components that are used in a greedy algorithm are:

o Candidate set: The set from which candidates are drawn to build a solution.
o Selection function: Chooses the best candidate to be added to the solution.
o Feasibility function: Determines whether a candidate can be used to contribute to the solution.
o Objective function: Assigns a value to the solution or the partial solution.
o Solution function: Indicates whether a complete solution has been reached.

Applications of Greedy Algorithm

o It is used in finding the shortest path.
o It is used to find the minimum spanning tree using Prim's algorithm or Kruskal's algorithm.
o It is used in job sequencing with deadlines.
o It is also used to solve the fractional knapsack problem.

Activity Selection Problem

What is Activity Selection Problem?

Consider n activities, each with a start and finish time. The objective is to find a solution set containing the maximum number of non-conflicting activities that can be executed in a single time frame, assuming that only one person or machine is available for execution.

Some points to note here:

• It might not be possible to complete all the activities, since their timings can overlap.
• Two activities, say i and j, are said to be non-conflicting if si >= fj or sj >= fi, where si and sj denote the starting times of activities i and j respectively, and fi and fj refer to their finishing times.
• A greedy approach can be used to find the solution, since we want to maximize the count of activities that can be executed. This approach greedily chooses the activity with the earliest finish time at every step, which yields an optimal solution.

Input Data for the Algorithm:

• act[] array containing all the activities.
• s[] array containing the starting time of all the activities.
• f[] array containing the finishing time of all the activities.

Output Data from the Algorithm:

• sol[] array referring to the solution set containing the maximum number of non-conflicting activities.

Steps for Activity Selection Problem

Following are the steps we will be following to solve the activity selection
problem,

Step 1: Sort the given activities in ascending order according to their finishing
time.

Step 2: Select the first activity from sorted array act[] and add it to sol[] array.

Step 3: Repeat steps 4 and 5 for the remaining activities in act[].

Step 4: If the start time of the currently selected activity is greater than or
equal to the finish time of previously selected activity, then add it to
the sol[] array.

Step 5: Select the next activity in act[] array.

Step 6: Print the sol[] array.

Algorithm Of Greedy- Activity Selector:

GREEDY-ACTIVITY-SELECTOR (s, f)

n ← length[s]
A ← {1}
j ← 1
for i ← 2 to n
    do if s[i] ≥ f[j]
        then A ← A ∪ {i}
             j ← i
return A
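The pseudocode translates directly into Python. In the sketch below, a2, a3, a5 and a6 carry the times from the worked example that follows, while the times for a1 and a4 are hypothetical values chosen only to be consistent with that walkthrough:

```python
def activity_selector(activities):
    """Select a maximum set of non-conflicting activities.
    activities: list of (name, start, finish) tuples."""
    # Step 1: sort the activities in ascending order of finishing time.
    acts = sorted(activities, key=lambda a: a[2])
    sol = [acts[0]]                  # Step 2: the first activity is always chosen
    for name, s, f in acts[1:]:      # Steps 3-5: scan the remaining activities
        if s >= sol[-1][2]:          # start >= finish of the last selected one
            sol.append((name, s, f))
    return sol

acts = [("a1", 6, 8), ("a2", 1, 2), ("a3", 3, 4),
        ("a4", 3, 5), ("a5", 5, 7), ("a6", 8, 9)]
print([a[0] for a in activity_selector(acts)])  # ['a2', 'a3', 'a5', 'a6']
```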

Example: Given 6 activities along with their start and end times (the activity table appears only as an image in the source). Step 1: Sort the given activities in ascending order of finishing time; a2 = (1, 2) comes first in the sorted array act[].

• Step 2: Select the first activity from the sorted array act[] and add it to the sol[] array, thus sol = {a2}.
• Step 3: Repeat steps 4 and 5 for the remaining activities in act[].
• Step 4: If the start time of the currently selected activity is greater than or equal to the finish time of the previously selected activity, then add it to sol[].
• Step 5: Select the next activity in act[].
• For the data given in the above table:
• Select activity a3. Since the start time of a3 is greater than the finish time of a2 (i.e. s(a3) > f(a2)), we add a3 to the solution set. Thus sol = {a2, a3}.
• Select a4. Since s(a4) < f(a3), it is not added to the solution set.
• Select a5. Since s(a5) > f(a3), a5 gets added to the solution set. Thus sol = {a2, a3, a5}.
• Select a1. Since s(a1) < f(a5), a1 is not added to the solution set.
• Select a6. a6 is added to the solution set since s(a6) > f(a5). Thus sol = {a2, a3, a5, a6}.

• Step 6: At last, print the sol[] array.
• Hence, the execution schedule of the maximum number of non-conflicting activities is:

(1,2)

(3,4)

(5,7)

(8,9)

In the above diagram, the selected activities have been highlighted in grey.

Huffman Coding

• Huffman Coding is a famous greedy algorithm.
• It is used for the lossless compression of data.
• It uses variable-length encoding: it assigns a variable-length code to each character.
• The code length of a character depends on how frequently it occurs in the given text.
• The character that occurs most frequently gets the smallest code.
• The character that occurs least frequently gets the largest code.
• It is also known as Huffman Encoding.

Prefix Rule-
• Huffman Coding implements a rule known as the prefix rule.
• This prevents ambiguity while decoding.
• It ensures that the code assigned to any character is not a prefix of the code assigned to any other character.
Major Steps in Huffman Coding-
There are two major steps in Huffman Coding-

1. Building a Huffman Tree from the input characters.
2. Assigning codes to the characters by traversing the Huffman Tree.
Huffman Tree-
The steps involved in the construction of Huffman Tree are as follows-
Step-01:
• Create a leaf node for each character of the text.
• The leaf node of a character contains its frequency of occurrence.
Step-02:
• Arrange all the nodes in increasing order of their frequency value.
Step-03:
Considering the first two nodes having minimum frequency,
• Create a new internal node.
• The frequency of this new node is the sum of the frequencies of those two nodes.
• Make the first node the left child and the other node the right child of the newly created node.
Step-04:
• Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
• The tree finally obtained is the desired Huffman Tree.
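The steps above can be sketched with Python's heapq module, where a min-heap stands in for the repeatedly re-sorted node list. The frequencies are taken from the practice problem below; only the code lengths are computed here (the exact 0/1 labels depend on the left/right convention chosen later):

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Build a Huffman tree by repeatedly merging the two
    minimum-frequency nodes; return {char: code length}."""
    tick = count()  # tie-breaker so heapq never compares the node dicts
    heap = [(f, next(tick), {ch: 0}) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)   # two minimum-frequency nodes
        f2, _, d2 = heapq.heappop(heap)
        # Merging pushes every character in both subtrees one level deeper.
        merged = {ch: depth + 1 for d in (d1, d2) for ch, depth in d.items()}
        heapq.heappush(heap, (f1 + f2, next(tick), merged))
    return heap[0][2]

freqs = {'a': 10, 'e': 15, 'i': 12, 'o': 3, 'u': 4, 's': 13, 't': 1}
lengths = huffman_code_lengths(freqs)
print(lengths)
print(sum(freqs[ch] * lengths[ch] for ch in freqs))  # 146 total encoded bits
```

The total, 146 bits, matches Formula-02 applied to the practice problem below.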
Time Complexity-

The time complexity analysis of Huffman Coding is as follows-

• extractMin( ) is called 2 x (n - 1) times if there are n nodes.
• As extractMin( ) calls minHeapify( ), each call takes O(log n) time.

Thus, the overall time complexity of Huffman Coding is O(n log n), where n is the number of unique characters in the given text.
Important Formulas-
The following 2 formulas are important to solve the problems based on Huffman Coding-
Formula-01:
Average code length per character
= ∑ ( frequencyi x code lengthi ) / ∑ ( frequencyi )
Formula-02:
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= ∑ ( frequencyi x code lengthi )

PRACTICE PROBLEM BASED ON HUFFMAN CODING-


Problem-

A file contains the following characters with the frequencies as shown. If Huffman Coding is used for data compression, determine-
1. Huffman Code for each character
2. Average code length
3. Length of the Huffman encoded message (in bits)

Characters   Frequencies
a            10
e            15
i            12
o            3
u            4
s            13
t            1

Solution-
• First let us construct the Huffman Tree.
• The Huffman Tree is constructed in the following steps-

Step-01 to Step-07: (the tree-construction diagrams appear only as images in the source). At each step, the two minimum-frequency nodes are merged into a new internal node:
t(1) + o(3) = 4
4 + u(4) = 8
8 + a(10) = 18
i(12) + s(13) = 25
e(15) + 18 = 33
25 + 33 = 58 (the root)

Now,
• We assign a weight to every edge of the constructed Huffman Tree.
• Let us assign weight '0' to the left edges and weight '1' to the right edges.

Rule
• If you assign weight '0' to the left edges, then assign weight '1' to the right edges.
• If you assign weight '1' to the left edges, then assign weight '0' to the right edges.
• Either of the two conventions may be followed.
• But the convention adopted at the time of encoding must also be followed at the time of decoding.

After assigning weight to all the edges, the modified Huffman Tree is-

Now, let us answer each part of the given problem one by one-

1. Huffman Code For Characters-

To write Huffman Code for any character, traverse the Huffman Tree from root
node to the leaf node of that character.
Following this rule, the Huffman Code for each character is-

• a = 111
• e = 10
• i = 00
• o = 11001
• u = 1101
• s = 01
• t = 11000

From here, we can observe-
• Characters occurring less frequently in the text are assigned longer codes.
• Characters occurring more frequently in the text are assigned shorter codes.

2. Average Code Length-

Using Formula-01, we have-

Average code length
= ∑ ( frequencyi x code lengthi ) / ∑ ( frequencyi )
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 146 / 58
≈ 2.52

3. Length of Huffman Encoded Message-

Using Formula-02, we have-

Total number of bits in Huffman encoded message
= ∑ ( frequencyi x code lengthi )
= (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5)
= 146 bits

(Note: multiplying 58 characters by the rounded average 2.52 gives 146.16, but the exact total is 146 bits; the discrepancy comes only from rounding the average.)

Dynamic Programming vs. Greedy Method

Definition: Both dynamic programming and the greedy method are used to obtain an optimum solution.

Feasibility: Dynamic programming has no special set of feasible solutions; in the greedy method, the optimum solution is obtained from a feasible set of solutions.

Recursion: Dynamic programming considers all the possible sequences in order to obtain the optimum solution; in the greedy method, the optimum solution is obtained without revising previously generated solutions.

Principle of optimality: Dynamic programming guarantees that it will generate the optimum solution, using the principle of optimality; the greedy method does not guarantee the optimum solution.

Memoization: Dynamic programming creates a lookup table to store the results of subproblems, which occupies memory space; the greedy method is more memory-efficient, as it does not create any table to store previous states.

Time complexity: Dynamic programming is slower than the greedy method; e.g., the Bellman-Ford algorithm takes O(VE) time, whereas Dijkstra's shortest path algorithm takes O((E + V) log V) time.

Method: Dynamic programming uses a bottom-up or top-down approach, breaking a complex problem down into simpler subproblems; the greedy method always computes the solution sequentially and does not look back at previous states.

Example: 0/1 knapsack problem (dynamic programming) vs. fractional knapsack problem (greedy).
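The example pairing of 0/1 knapsack and fractional knapsack can be made concrete. In the sketch below (the item weights and values are illustrative, not from the source), the same instance is solved both ways: bottom-up DP for the 0/1 version, and a value/weight-ratio greedy for the fractional version:

```python
def knapsack_01(items, cap):
    """Bottom-up DP: dp[c] = best value achievable with capacity c."""
    dp = [0] * (cap + 1)
    for w, v in items:
        for c in range(cap, w - 1, -1):   # reverse scan: each item used at most once
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[cap]

def knapsack_fractional(items, cap):
    """Greedy: take items in decreasing value/weight ratio, splitting the last one."""
    total = 0.0
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        take = min(w, cap)
        total += v * take / w
        cap -= take
        if cap == 0:
            break
    return total

items = [(10, 60), (20, 100), (30, 120)]   # (weight, value) pairs
print(knapsack_01(items, 50))              # 220
print(knapsack_fractional(items, 50))      # 240.0
```

The greedy answer is higher only because fractions are allowed; restricted to whole items, the greedy ratio rule would pick items 1 and 2 for value 160, which is why the 0/1 variant needs dynamic programming.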

Longest Common Subsequence

• The longest common subsequence (LCS) is defined as the longest subsequence that is common to all the given sequences, provided that the elements of the subsequence are not required to occupy consecutive positions within the original sequences.
• If S1 and S2 are the two given sequences, then Z is a common subsequence of S1 and S2 if Z is a subsequence of both S1 and S2. Furthermore, Z must be formed from a strictly increasing sequence of indices of both S1 and S2.
• In a strictly increasing sequence, the indices of the elements chosen from the original sequences must be in ascending order in Z.
• If

S1 = {B, C, D, A, A, C, D}

then {A, D, B} cannot be a subsequence of S1, as the order of the elements is not preserved (i.e., the indices do not form a strictly increasing sequence).

Let us understand LCS with an example.

If

S1 = {B, C, D, A, A, C, D}

S2 = {A, C, D, B, A, C}

Then, common subsequences are {B, C}, {C, D, A, C}, {D, A, C}, {A, A, C}, {A,
C}, {C, D}, ...
Among these subsequences, {C, D, A, C} is the longest common
subsequence. We are going to find this longest common subsequence using
dynamic programming.
Before proceeding further, if you are not already familiar with dynamic programming, please review it first.

Using Dynamic Programming to find the LCS

Let us take two sequences X and Y (the two example sequences appear only as images in the source).

The following steps are followed for finding the longest common
subsequence.

1. Create a table of dimension (n+1) x (m+1), where n and m are the lengths of X and Y respectively. The first row and the first column are filled with zeros.

Initialise a table
2. Fill each cell of the table using the following logic.

3. If the characters corresponding to the current row and the current column match, then fill the current cell by adding one to the diagonal element. Point an arrow to the diagonal cell.

4. Else take the maximum value from the previous column and previous row elements to fill the current cell. Point an arrow to the cell with the maximum value. If they are equal, point to either of them.

Fill the values

5. Step 2 is repeated until the table is filled.

Fill all the values



6. The value in the last row and the last column is the length of the longest common subsequence; i.e., the bottom-right corner holds the length of the LCS.

7. In order to find the longest common subsequence, start from the last cell and follow the direction of the arrows. The elements corresponding to the diagonal arrows form the longest common subsequence.

Create a path according to the arrows



Thus, the longest common subsequence is CA.

LCS

How is a dynamic programming algorithm more efficient than the recursive algorithm while solving an LCS problem?

• The method of dynamic programming reduces the number of function calls. It stores the result of each function call so that it can be used in future calls without the need for redundant calls.
• In the above dynamic programming algorithm, the results obtained from each comparison between elements of X and elements of Y are stored in a table so that they can be used in future computations.
• So, the time taken by the dynamic approach is the time taken to fill the table (i.e. O(mn)), whereas the recursive algorithm has a complexity of O(2^max(m, n)).

Longest Common Subsequence Algorithm

Let X and Y be the two given sequences

Initialize a table LCS of dimension (X.length + 1) x (Y.length + 1)
LCS[0][j] = 0 for every column j
LCS[i][0] = 0 for every row i

Start from LCS[1][1]
For each pair (i, j), compare X[i] and Y[j]:
    If X[i] = Y[j]
        LCS[i][j] = 1 + LCS[i-1][j-1]
        Point an arrow to the diagonal cell LCS[i-1][j-1]
    Else
        LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])
        Point an arrow to the cell holding that maximum
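The pseudocode above corresponds to the following Python sketch, run here on the S1/S2 example from earlier in this section; the arrows are replaced by an explicit backtracking pass over the finished table:

```python
def lcs(X, Y):
    """Dynamic-programming LCS: fill an (n+1) x (m+1) table, then backtrack."""
    n, m = len(X), len(Y)
    L = [[0] * (m + 1) for _ in range(n + 1)]   # first row/column stay zero
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if X[i - 1] == Y[j - 1]:
                L[i][j] = 1 + L[i - 1][j - 1]   # diagonal + 1 on a match
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # Backtrack from the bottom-right corner to recover one LCS.
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("BCDAACD", "ACDBAC"))  # CDAC
```

This recovers {C, D, A, C}, the longest common subsequence identified for S1 and S2 earlier, in O(mn) time.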
