Unit 1 - 2


DIVIDE AND CONQUER ALGORITHM

In this approach, we solve a problem recursively by applying three steps:

1. DIVIDE: break the problem into several subproblems of smaller size.
2. CONQUER: solve the subproblems recursively.
3. COMBINE: combine the subproblem solutions into a solution to the original problem.

CONTROL ABSTRACTION FOR DIVIDE AND CONQUER ALGORITHM

Algorithm DAndC(P)
{
    if Small(P) then
        return S(P)
    else
    {
        divide P into smaller instances P1, P2, ..., Pk;
        apply DAndC to each subproblem;
        return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
    }
}
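
To make the control abstraction concrete, here is a minimal C sketch that instantiates it for a toy problem, finding the maximum of an array. The function name dc_max and the example values are illustrative only, not part of the original notes:

#include <stdio.h>

/* A tiny instance of the divide-and-conquer control abstraction:
 * find the maximum of A[lo..hi].
 * Small(P): one element       -> S(P): return it
 * divide:   split at midpoint -> two subproblems
 * combine:  larger of the two partial answers
 */
int dc_max(const int A[], int lo, int hi) {
    if (lo == hi)                          /* Small(P) */
        return A[lo];                      /* S(P) */
    int mid = lo + (hi - lo) / 2;          /* divide */
    int left  = dc_max(A, lo, mid);        /* conquer P1 */
    int right = dc_max(A, mid + 1, hi);    /* conquer P2 */
    return left > right ? left : right;    /* combine */
}

int main(void) {
    int A[] = {3, 9, 2, 7, 5};
    printf("%d\n", dc_max(A, 0, 4));       /* prints 9 */
    return 0;
}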

The running time of a divide-and-conquer algorithm can be described by the recurrence

    T(n) = Θ(1)                      if n ≤ c,
    T(n) = aT(n/b) + D(n) + C(n)     otherwise,

where n is the input size, a is the number of subproblems, n/b is the input size of each subproblem, D(n) is the cost of dividing the problem, and C(n) is the cost of combining the subproblem solutions. (For merge sort, for example, a = b = 2, D(n) = Θ(1), and C(n) = Θ(n).)

Lecture 7: Worst case analysis of merge sort, quick sort

Merge sort

Merge sort is one of the best-known divide-and-conquer algorithms. It is a simple and very efficient algorithm for sorting a list of numbers.

We are given a sequence of n numbers, which we will assume is stored in an array A[1..n]. The objective is to output a permutation of this sequence, sorted in increasing order. This is normally done by permuting the elements within the array A.

How can we apply divide-and-conquer to sorting? Here are the major elements of the Merge Sort algorithm.

Divide: Split A down the middle into two subsequences, each of size roughly n/2.
Conquer: Sort each subsequence (by calling MergeSort recursively on each).
Combine: Merge the two sorted subsequences into a single sorted list.

The dividing process ends when we have split the subsequences down to a single item. A sequence of length one is trivially sorted. The key operation, where all the work is done, is the combine stage, which merges two sorted lists into a single sorted list. It turns out that the merging process is quite easy to implement.

The following figure gives a high-level view of the algorithm. The “divide” phase is shown on the left. It works top-down, splitting the list into smaller sublists. The “conquer and combine” phases are shown on the right. They work bottom-up, merging sorted lists together into larger sorted lists.

Figure: Merge Sort

Designing the Merge Sort algorithm top-down: we’ll assume that the procedure that merges two sorted lists is available to us; we’ll implement it later. Because the algorithm is called recursively on sublists, in addition to passing in the array itself we will pass in two indices, which indicate the first and last indices of the subarray that we are to sort. The call MergeSort(A, p, r) will sort the subarray A[p..r] and return the sorted result in the same subarray.

Here is the overview. If r = p, then there is only one element to sort, and we may return immediately. Otherwise (if p < r) there are at least two elements, and we invoke the divide-and-conquer strategy. We find the index q, midway between p and r, namely q = (p + r)/2 (rounded down to the nearest integer). Then we split the array into subarrays A[p..q] and A[q+1..r], call MergeSort recursively to sort each subarray, and finally invoke a procedure (which we have yet to write) that merges these two subarrays into a single sorted array.

MergeSort(array A, int p, int r) {
    if (p < r) {                     // we have at least 2 items
        q = (p + r)/2
        MergeSort(A, p, q)           // sort A[p..q]
        MergeSort(A, q+1, r)         // sort A[q+1..r]
        Merge(A, p, q, r)            // merge everything together
    }
}

Merging: All that is left is to describe the procedure that merges two sorted lists. Merge(A, p, q, r) assumes that the left subarray, A[p..q], and the right subarray, A[q+1..r], have already been sorted. We merge these two subarrays by copying the elements to a temporary working array called B. For convenience, we will assume that the array B has the same index range as A, that is, B[p..r]. We have two indices i and j that point to the current elements of each subarray. We move the smaller element into the next position of B (indicated by index k) and then increment the corresponding index (either i or j). When we run out of elements in one array, we just copy the rest of the other array into B. Finally, we copy the entire contents of B back into A.

Merge(array A, int p, int q, int r) {    // merges A[p..q] with A[q+1..r]
    array B[p..r]
    i = k = p                            // initialize pointers
    j = q+1
    while (i <= q and j <= r) {          // while both subarrays are nonempty
        if (A[i] <= A[j]) B[k++] = A[i++]    // copy from left subarray
        else B[k++] = A[j++]                 // copy from right subarray
    }
    while (i <= q) B[k++] = A[i++]       // copy any leftover to B
    while (j <= r) B[k++] = A[j++]
    for i = p to r do A[i] = B[i]        // copy B back to A
}
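
The pseudocode above translates almost directly into a real language. The following is one possible C rendering, as a sketch (0-based indexing, with the working array B allocated inside the merge; the names merge and merge_sort are ours):

#include <stdio.h>
#include <stdlib.h>

/* Merge the sorted runs A[p..q] and A[q+1..r] (0-based, inclusive). */
static void merge(int A[], int p, int q, int r) {
    int n = r - p + 1;
    int *B = malloc(n * sizeof *B);      /* temporary working array */
    int i = p, j = q + 1, k = 0;
    while (i <= q && j <= r)             /* both runs nonempty */
        B[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= q) B[k++] = A[i++];      /* leftover of left run */
    while (j <= r) B[k++] = A[j++];      /* leftover of right run */
    for (k = 0; k < n; k++)              /* copy B back to A */
        A[p + k] = B[k];
    free(B);
}

/* Sort A[p..r] in place. */
static void merge_sort(int A[], int p, int r) {
    if (p < r) {                         /* at least two items */
        int q = p + (r - p) / 2;         /* midpoint, avoids overflow */
        merge_sort(A, p, q);
        merge_sort(A, q + 1, r);
        merge(A, p, q, r);
    }
}

int main(void) {
    int A[] = {5, 2, 4, 7, 1, 3, 2, 6};
    merge_sort(A, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", A[i]);
    printf("\n");                        /* 1 2 2 3 4 5 6 7 */
    return 0;
}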

Analysis: What remains is to analyze the running time of MergeSort. First let us consider the running time of the procedure Merge(A, p, q, r). Let n = r − p + 1 denote the total length of both the left and right subarrays. What is the running time of Merge as a function of n? The algorithm contains four loops (none nested in the other). It is easy to see that each loop can be executed at most n times. (If you are a bit more careful, you can see that all the while-loops together can only be executed n times in total, because each execution copies one new element to the array B, and B only has space for n elements.) Thus the running time to Merge n items is Θ(n). Let us write this without the asymptotic notation, simply as n. (We’ll see later why we do this.)

Now, how do we describe the running time of the entire MergeSort algorithm? We will do this through the use of a recurrence, that is, a function that is defined recursively in terms of itself. To avoid circularity, the recurrence for a given value of n is defined in terms of values that are strictly smaller than n. Finally, a recurrence has some basis values (e.g., for n = 1), which are defined explicitly.

Let’s see how to apply this to MergeSort. Let T(n) denote the worst-case running time of MergeSort on an array of length n. For concreteness we could count whatever we like: number of lines of pseudocode, number of comparisons, number of array accesses; these will only differ by a constant factor. Since all of the real work is done in the Merge procedure, we will count the total time spent in the Merge procedure.

First observe that if we call MergeSort with a list containing a single element, then the running time is a constant. Since we are ignoring constant factors, we can just write T(n) = 1. When we call MergeSort with a list of length n > 1, e.g. MergeSort(A, p, r) where r − p + 1 = n, the algorithm first computes q = (p + r)/2. The subarray A[p..q] contains q − p + 1 elements; you can verify that this is roughly n/2. The remaining subarray A[q+1..r] likewise has roughly n/2 elements. How long does it take to sort the left subarray? We do not know this, but because n/2 < n for n > 1, we can express it as T(n/2). Similarly, we can express the time that it takes to sort the right subarray as T(n/2).

Finally, to merge both sorted lists takes n time, by the comments made above. In conclusion we have

    T(n) = 1               if n = 1,
    T(n) = 2T(n/2) + n     otherwise.

Solving the above recurrence, we can see that merge sort has a time complexity of Θ(n log n).
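
One standard way to see this (assuming for simplicity that n is a power of 2) is to unroll the recurrence:

    T(n) = 2T(n/2) + n
         = 4T(n/4) + 2n
         = 8T(n/8) + 3n
         = ...
         = 2^k T(n/2^k) + kn.

Taking k = log2 n gives T(n) = nT(1) + n log2 n = n + n log2 n = Θ(n log n).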
QUICKSORT

• Worst-case running time: O(n²).
• Expected running time: O(n lg n).
• Sorts in place.

Description of quicksort
Quicksort is based on the three-step process of divide-and-conquer.
• To sort the subarray A[p..r]:
Divide: Partition A[p..r] into two (possibly empty) subarrays A[p..q−1] and A[q+1..r], such that each element in the first subarray A[p..q−1] is ≤ A[q] and A[q] is ≤ each element in the second subarray A[q+1..r].
Conquer: Sort the two subarrays by recursive calls to QUICKSORT.
Combine: No work is needed to combine the subarrays, because they are sorted in place.
• Perform the divide step by a procedure PARTITION, which returns the index q that marks the position separating the subarrays.
QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q − 1)
             QUICKSORT(A, q + 1, r)

Initial call is QUICKSORT(A, 1, n).

Partitioning
Partition subarray A[p..r] by the following procedure:

PARTITION(A, p, r)
    x ← A[r]
    i ← p − 1
    for j ← p to r − 1
        do if A[j] ≤ x
            then i ← i + 1
                 exchange A[i] ↔ A[j]
    exchange A[i + 1] ↔ A[r]
    return i + 1

• PARTITION always selects the last element A[r] in the subarray A[p..r] as the pivot, the element around which to partition.
• As the procedure executes, the array is partitioned into four regions, some of which may be empty: A[p..i] (elements ≤ x), A[i+1..j−1] (elements > x), A[j..r−1] (not yet examined), and A[r] = x (the pivot itself).
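
A direct C transcription of PARTITION and QUICKSORT might look as follows (0-based indices, last element as pivot, exactly as in the pseudocode; the helper names are ours and this is an illustrative sketch rather than a tuned implementation):

#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partition A[p..r] around the pivot x = A[r]; return the pivot's
 * final index q, so that A[p..q-1] <= A[q] <= A[q+1..r]. */
static int partition(int A[], int p, int r) {
    int x = A[r];                    /* pivot: last element */
    int i = p - 1;                   /* end of the "<= pivot" region */
    for (int j = p; j <= r - 1; j++)
        if (A[j] <= x)
            swap(&A[++i], &A[j]);    /* grow the "<= pivot" region */
    swap(&A[i + 1], &A[r]);          /* put pivot in its final place */
    return i + 1;
}

static void quicksort(int A[], int p, int r) {
    if (p < r) {
        int q = partition(A, p, r);
        quicksort(A, p, q - 1);      /* elements <= pivot */
        quicksort(A, q + 1, r);      /* elements >= pivot */
    }
}

int main(void) {
    int A[] = {2, 8, 7, 1, 3, 5, 6, 4};
    quicksort(A, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", A[i]);
    printf("\n");                    /* 1 2 3 4 5 6 7 8 */
    return 0;
}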

Performance of Quicksort
The running time of Quicksort depends on the partitioning of the subarrays:
• If the subarrays are balanced, then Quicksort can run as fast as mergesort.
• If they are unbalanced, then Quicksort can run as slowly as insertion sort.

Worst case
• Occurs when the subarrays are completely unbalanced.
• Have 0 elements in one subarray and n − 1 elements in the other subarray.
• Get the recurrence
    T(n) = T(n − 1) + T(0) + Θ(n)
         = T(n − 1) + Θ(n)
         = O(n²).
• Same running time as insertion sort.
• In fact, the worst-case running time occurs when Quicksort takes a sorted array as input, but
insertion sort runs in O(n) time in this case.

Best case
• Occurs when the subarrays are completely balanced every time.
• Each subarray has ≤ n/2 elements.
• Get the recurrence
    T(n) = 2T(n/2) + Θ(n) = O(n lg n).

Balanced partitioning
• Quicksort’s average running time is much closer to the best case than to the worst case.
• Imagine that PARTITION always produces a 9-to-1 split.
• Get the recurrence
    T(n) ≤ T(9n/10) + T(n/10) + Θ(n) = O(n lg n).
• Intuition: look at the recursion tree.
• It’s like the one for T(n) = T(n/3) + T(2n/3) + O(n).
• Except that here the constants are different; we get log_10 n full levels and log_{10/9} n levels that are nonempty.
• As long as it’s a constant, the base of the log doesn’t matter in asymptotic notation.
• Any split of constant proportionality will yield a recursion tree of depth O(lg n).
Lecture 8 - Heaps and Heap sort

HEAPSORT

• In-place algorithm
• Running time: O(n log n)
• Based on a complete binary tree

The (binary) heap data structure is an array object that we can view as a nearly complete binary tree. Each node of the tree corresponds to an element of the array. The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.

The root of the tree is A[1], and given the index i of a node, we can easily compute the indices of its parent, left child, and right child:

• PARENT(i) => return ⌊i/2⌋
• LEFT(i)   => return 2i
• RIGHT(i)  => return 2i + 1

On most computers, the LEFT procedure can compute 2i in one instruction by simply shifting the
binary representation of i left by one bit position.

Similarly, the RIGHT procedure can quickly compute 2i + 1 by shifting the binary representation of i left by one bit position and then adding in a 1 as the low-order bit. The PARENT procedure can compute ⌊i/2⌋ by shifting i right one bit position. Good implementations of heapsort often implement these procedures as "macros" or "inline" procedures.
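
In C, these could plausibly be written as inline helpers using the bit shifts just described (1-based heap indices, as in the notes; a sketch, not a prescribed interface):

/* 1-based heap indexing; shifts implement multiplication and
   floor-division by 2 as described above. */
static inline int parent(int i) { return i >> 1; }        /* floor(i/2) */
static inline int left(int i)   { return i << 1; }        /* 2i */
static inline int right(int i)  { return (i << 1) | 1; }  /* 2i + 1 */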

There are two kinds of binary heaps: max-heaps and min-heaps.

• In a max-heap, the max-heap property is that for every node i other than the root, A[PARENT(i)] >= A[i]; that is, the value of a node is at most the value of its parent. Thus, the largest element in a max-heap is stored at the root, and the subtree rooted at a node contains values no larger than that contained at the node itself.
• A min-heap is organized in the opposite way; the min-heap property is that for every node i other than the root, A[PARENT(i)] <= A[i].

The smallest element in a min-heap is at the root.

• The height of a node in a heap is the number of edges on the longest simple downward path from the node to a leaf.
• The height of the heap is the height of its root.
• The height of a heap of n elements, which is based on a complete binary tree, is O(log n).

Maintaining the heap property

MAX-HEAPIFY lets the value at A[i] "float down" in the max-heap so that the
subtree rooted at index i obeys the max-heap property.

MAX-HEAPIFY(A, i)

1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ heap-size[A] and A[l] > A[i]
4.     then largest ← l
5.     else largest ← i
6. if r ≤ heap-size[A] and A[r] > A[largest]
7.     then largest ← r
8. if largest ≠ i
9.     then exchange A[i] ↔ A[largest]
10.         MAX-HEAPIFY(A, largest)

At each step, the largest of the elements A[i], A[LEFT(i)], and A[RIGHT(i)] is determined, and its index is stored in largest. If A[i] is largest, then the subtree rooted at node i is already a max-heap and the procedure terminates. Otherwise, one of the two children has the largest element, and A[i] is swapped with A[largest], which causes node i and its children to satisfy the max-heap property. The node indexed by largest, however, now has the original value A[i], and thus the subtree rooted at largest might violate the max-heap property. Consequently, we call MAX-HEAPIFY recursively on that subtree.

Figure: The action of MAX-HEAPIFY(A, 2), where heap-size = 10. (a) The initial configuration, with A[2] at node i = 2 violating the max-heap property since it is not larger than both children. The max-heap property is restored for node 2 in (b) by exchanging A[2] with A[4], which destroys the max-heap property for node 4. The recursive call MAX-HEAPIFY(A, 4) now has i = 4. After swapping A[4] with A[9], as shown in (c), node 4 is fixed up, and the recursive call MAX-HEAPIFY(A, 9) yields no further change to the data structure.

The running time of MAX-HEAPIFY on a subtree of size n can be described by the recurrence

    T(n) ≤ T(2n/3) + Θ(1),

since the children’s subtrees each have size at most 2n/3 (the worst case occurs when the bottom level of the tree is exactly half full). The solution to this recurrence is T(n) = O(log n).

Building a heap

BUILD-MAX-HEAP(A)

1. for i ← ⌊n/2⌋ downto 1
2.     do MAX-HEAPIFY(A, i)

Figure: BUILD-MAX-HEAP operating on an example input array.
We can derive a tighter bound by observing that the time for MAX-HEAPIFY to run at a node varies with the height of the node in the tree, and the heights of most nodes are small. Our tighter analysis relies on the properties that an n-element heap has height ⌊log n⌋ and at most ⌈n/2^(h+1)⌉ nodes of any height h.

The total cost of BUILD-MAX-HEAP is thus bounded by T(n) = O(n).

The HEAPSORT Algorithm

HEAPSORT(A)

1. BUILD-MAX-HEAP(A)
2. for i ← n downto 2
3.     do exchange A[1] ↔ A[i]
4.        heap-size[A] ← heap-size[A] − 1
5.        MAX-HEAPIFY(A, 1)
Figure: The sorted array A = 1 2 3 4 7 8 9 10 14 16 produced by HEAPSORT.

The HEAPSORT procedure takes time O(n log n), since the call to BUILD-MAX-HEAP takes time O(n) and each of the n − 1 calls to MAX-HEAPIFY takes time O(log n).
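
Putting the pieces together, here is a compact C sketch of the whole algorithm. It uses 0-based indices, so LEFT(i) becomes 2i + 1 and the heap size is passed explicitly instead of the heap-size[A] attribute; the names max_heapify and heap_sort are ours:

#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Float A[i] down so the subtree rooted at i obeys the max-heap
   property; n is the current heap size (0-based indices). */
static void max_heapify(int A[], int n, int i) {
    int l = 2 * i + 1, r = 2 * i + 2, largest = i;
    if (l < n && A[l] > A[largest]) largest = l;
    if (r < n && A[r] > A[largest]) largest = r;
    if (largest != i) {
        swap(&A[i], &A[largest]);
        max_heapify(A, n, largest);    /* continue floating down */
    }
}

static void heap_sort(int A[], int n) {
    /* Build a max-heap by heapifying every non-leaf, bottom-up. */
    for (int i = n / 2 - 1; i >= 0; i--)
        max_heapify(A, n, i);
    /* Repeatedly move the maximum to the end and shrink the heap. */
    for (int i = n - 1; i >= 1; i--) {
        swap(&A[0], &A[i]);
        max_heapify(A, i, 0);
    }
}

int main(void) {
    int A[] = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7};
    int n = sizeof A / sizeof A[0];
    heap_sort(A, n);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");                      /* 1 2 3 4 7 8 9 10 14 16 */
    return 0;
}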
