Heap Sort

Uploaded by Muhammad Yasir

Design and Analysis of

Algorithms
(2.3,4.1,4.2,6.1,6.2)
Introduction to heapsort
Sorting Revisited
● So far we’ve talked about two algorithms to
sort an array of numbers
■ What is the advantage of merge sort?
■ What is the advantage of insertion sort?
● Next on the agenda: Heapsort
■ Combines advantages of both previous algorithms
● Like merge sort, but unlike insertion sort, heapsort’s
running time is O(n lg n).
● Like insertion sort, but unlike merge sort, heapsort
sorts in place: only a constant number of array
elements are stored outside the input array at any
time.
● Thus, heapsort combines the better attributes of the
two sorting algorithms we have already discussed.
Heaps
● A heap can be seen as a complete binary tree:

16

14 10

8 7 9 3

2 4 1

■ What makes a binary tree complete?


■ Is the example above complete?
● A binary tree that is completely filled, with the
possible exception of the bottom level, which
is filled from left to right, is called a complete
binary tree
Heaps
● A heap can be seen as a complete binary tree:

16

14 10

8 7 9 3

2 4 1 _ _ _ _ _

■ The book calls them “nearly complete” binary trees;


can think of unfilled slots as null pointers
Heaps
● In practice, heaps are usually implemented as
arrays:
16

14 10

8 7 9 3
A = 16 14 10 8 7 9 3 2 4 1
2 4 1
Heaps
● To represent a complete binary tree as an array:
■ The root node is A[1]
■ Node i is A[i]
■ The parent of node i is A[i/2] (note: integer divide)
■ The left child of node i is A[2i]
■ The right child of node i is A[2i + 1]
16

14 10

8 7 9 3
A = 16 14 10 8 7 9 3 2 4 1
2 4 1
Referencing Heap Elements
● So…
Parent(i) { return i/2; }
Left(i) { return 2*i; }
Right(i) { return 2*i + 1; }
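These index formulas are easy to sanity-check. A minimal Python sketch (not from the slides; it keeps A[0] as an unused slot so indices match the 1-indexed pseudocode):

```python
# Helpers mirroring the 1-indexed pseudocode above.
# A[0] is kept as an unused slot so array indices match the tree's node numbers.
def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

# The heap from the earlier figure: 16 14 10 8 7 9 3 2 4 1
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

print(A[parent(4)])             # parent of node 4 (value 8) is node 2 -> 16? no: 14
print(A[left(2)], A[right(2)])  # children of node 2 (value 14) -> 8 7
```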
● There are two kinds of binary heaps:
● max-heaps and min-heaps.
The Heap Property
● Heaps also satisfy the heap property:
● In a max-heap, the max-heap property is that
for every node i other than the root,
A[Parent(i)] ≥ A[i]
■ In other words, the value of a node is at most the
value of its parent
■ Where is the largest element in a heap stored?
The Heap Property
● A min-heap is organized in the opposite way; the
min-heap property is that for every node i other than
the root,
● A[Parent(i)] ≤ A[i].
● The smallest element in a min-heap is at the root.
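Both properties can be checked directly from the parent index formula. A small Python sketch (the helper names are my own, not from the slides; 1-indexed with A[0] unused):

```python
def is_max_heap(A, heap_size):
    # Max-heap property: A[parent(i)] >= A[i] for every node i > 1.
    return all(A[i // 2] >= A[i] for i in range(2, heap_size + 1))

def is_min_heap(A, heap_size):
    # Min-heap property: A[parent(i)] <= A[i] for every node i > 1.
    return all(A[i // 2] <= A[i] for i in range(2, heap_size + 1))

A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]  # the example heap
print(is_max_heap(A, 10))  # True: every node is at most its parent
print(is_min_heap(A, 10))  # False: 16 at the root is not the smallest
```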
● Definitions:
■ The height of a node in the tree = the number of
edges on the longest downward path to a leaf
■ The height of a tree = the height of its root
Heap Height
● What is the height of an n-element heap?
Why?
● we define the height of the heap to be the
height of its root.
● Since a heap of n elements is based on a
complete binary tree, its height is Θ(lg n)
Heap Operations: MAX_Heapify()
● MAX_Heapify(): maintain the heap property
■ Given: a node i in the heap with children l and r
■ Given: two subtrees rooted at l and r, assumed to be
heaps
■ Problem: The subtree rooted at i may violate the
heap property (How?)
■ Action: let the value of the parent node “float
down” so subtree at i satisfies the heap property
○ What do you suppose will be the basic operation between
i, l, and r?
Heap Operations: MAX_Heapify()
MAX_Heapify(A, i)
{
l = Left(i); r = Right(i);
if (l <= heap_size(A) && A[l] > A[i])
largest = l;
else
largest = i;
if (r <= heap_size(A) && A[r] > A[largest])
largest = r;
if (largest != i)
{
Swap(A, i, largest);
MAX_Heapify(A, largest);
}
}
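The pseudocode translates almost line-for-line into Python. A runnable sketch (1-indexed with A[0] unused; heap_size is passed explicitly rather than stored with A):

```python
def max_heapify(A, i, heap_size):
    # Float A[i] down until the subtree rooted at i is a max-heap (1-indexed).
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[i]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)

# The example that follows: node 2 (value 4) violates the max-heap property.
A = [None, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(A, 2, 10)
print(A[1:])  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```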
Heapify() Example

16

4 10

14 7 9 3

2 8 1

A = 16 4 10 14 7 9 3 2 8 1
(Node 2, with value 4, violates the max-heap property; its larger child is 14.)
Heapify() Example

16

14 10

4 7 9 3

2 8 1

A = 16 14 10 4 7 9 3 2 8 1
(After swapping 4 with 14, node 4 now violates the max-heap property; its larger child is 8.)
Heapify() Example

16

14 10

8 7 9 3

2 4 1

A = 16 14 10 8 7 9 3 2 4 1
(After swapping 4 with 8, the max-heap property is restored.)
Analyzing MAX_Heapify(): Informal
● Aside from the recursive call, what is the
running time of MAX_Heapify()?
● How many times can MAX-Heapify()
recursively call itself?
● What is the worst-case running time of
MAX_Heapify() on a heap of size n?
Analyzing MAX_Heapify(): Formal
● Fixing up relationships between i, l, and r
takes Θ(1) time
● If the heap at i has n elements, how many
elements can the subtrees at l or r have?
■ Draw it
● Answer: at most 2n/3 (the worst case occurs when
the bottom level of the tree is exactly half full)
● So the time taken by MAX_Heapify() is given
by T(n) ≤ T(2n/3) + Θ(1)
Analyzing MAX_Heapify(): Formal
● So we have
T(n) ≤ T(2n/3) + Θ(1)
● By case 2 of the Master Theorem,
T(n) = O(lg n)
● Thus, MAX_Heapify() takes logarithmic time
Heap Operations: BuildHeap()
● We can build a heap in a bottom-up manner by
running MAX-Heapify() on successive
subarrays
■ Fact: for an array of length n, all elements in the range
A[(⌊n/2⌋ + 1) .. n] are heaps (Why?)
○ elements in the subarray A[(⌊n/2⌋+1) .. n] are all leaves
of the tree, and so each is a 1-element heap to begin
with.
Heap Operations: BuildHeap()

16

4 10

14 7 9 3

2 8 1

A = 16 4 10 14 7 9 3 2 8 1
Heap Operations: BuildHeap()
■ So:
○ Walk backwards through the array from n/2 to 1, calling
MAX_Heapify() on each node.
○ i.e. The procedure BuildHeap goes through the
remaining nodes of the tree and runs MAX_HEAPIFY
on each one
○ Order of processing guarantees that the children of node
i are heaps when i is processed
BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
heap_size(A) = length(A);
for (i = length[A]/2 downto 1)
MAX_Heapify(A, i);
}
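A runnable Python version of BuildHeap(), with max_heapify repeated so the sketch is self-contained (1-indexed with A[0] unused; heap_size passed explicitly):

```python
def max_heapify(A, i, heap_size):
    # Float A[i] down until the subtree rooted at i is a max-heap (1-indexed).
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[i]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)

def build_max_heap(A):
    # Walk backwards from the last internal node (n/2) to the root;
    # everything past n/2 is a leaf and already a 1-element heap.
    heap_size = len(A) - 1  # A[0] is unused
    for i in range(heap_size // 2, 0, -1):
        max_heapify(A, i, heap_size)

A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]  # the worked example below
build_max_heap(A)
print(A[1:])  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```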
BuildHeap() Example
● Work through example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}

4

1 3

2 16 9 10

14 8 7
Analyzing BuildHeap()
● Each call to MAX_Heapify() takes O(lg n)
time
● There are O(n) such calls (specifically, n/2)
● Thus the running time is O(n lg n)
■ Is this a correct asymptotic upper bound?
■ Is this an asymptotically tight bound?
● A tighter bound is O(n)
■ How can this be? Is there a flaw in the above
reasoning?
■ Answer: the bound is not tight; most nodes sit near
the leaves, where MAX_Heapify() does little work, and
summing the cost O(h) over the at most ⌈n/2^(h+1)⌉
nodes of height h gives O(n)
Heapsort
● Given BuildHeap(), an in-place sorting
algorithm is easily constructed:
■ Maximum element is at A[1]
■ Discard by swapping with element at A[n]
○ Decrement heap_size[A]
○ A[n] now contains correct value
■ Restore heap property at A[1] by calling
MAX_Heapify()
■ Repeat, always swapping A[1] for A[heap_size(A)]
Heapsort
● The heapsort algorithm starts by using BUILD-HEAP to build
a max-heap on the input array A[1 . . n], where n = length[A].
● Since the maximum element of the array is stored at the root
A[1], it can be put into its correct final position by exchanging
it with A[n].
● If we now “discard” node n from the heap (by decrementing
heap-size[A]), we observe that A[1 . . (n − 1)] can easily be
made into a max-heap.
● The children of the root remain max-heaps, but the new root
element may violate the max-heap property.
Heapsort
● All that is needed to restore the max-heap property, however,
is one call to MAX-HEAPIFY(A, 1), which leaves a max-heap
in A[1 . . (n − 1)].
● The heapsort algorithm then repeats this process for the
max-heap of size n − 1 down to a heap of size 2.
Heapsort
Heapsort(A)
{
BuildHeap(A);
for (i = length(A) downto 2)
{
Swap(A[1], A[i]);
heap_size(A) -= 1;
MAX_Heapify(A, 1);
}
}
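Putting the pieces together, a self-contained Python sketch of the full algorithm (1-indexed with A[0] unused, as in the pseudocode above):

```python
def max_heapify(A, i, heap_size):
    # Float A[i] down until the subtree rooted at i is a max-heap (1-indexed).
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[i]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)

def heapsort(A):
    # Build a max-heap, then repeatedly move the max into its final slot.
    heap_size = len(A) - 1          # A[0] is unused
    for i in range(heap_size // 2, 0, -1):
        max_heapify(A, i, heap_size)
    for i in range(heap_size, 1, -1):
        A[1], A[i] = A[i], A[1]     # largest remaining element to position i
        heap_size -= 1
        max_heapify(A, 1, heap_size)

A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A)
print(A[1:])  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

The sort is in place: apart from the swap temporaries, no storage outside A is used.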
Analyzing Heapsort
● The call to BuildHeap() takes O(n) time
● Each of the n - 1 calls to MAX_Heapify()
takes O(lg n) time
● Thus the total time taken by HeapSort()
= O(n) + (n - 1) O(lg n)
= O(n) + O(n lg n)
= O(n lg n)
Priority Queues
● Heapsort is a nice algorithm, but in practice
Quicksort (coming up) usually wins
● Application: But the heap data structure is
incredibly useful for implementing priority
queues
● As with heaps, there are two kinds of priority
queues: max-priority queues and min-priority
queues.
○ We will focus here on how to implement max-priority
queues, which are in turn based on max-heaps
Priority Queues
■ A Priority Queue is a data structure for
maintaining a set S of elements, each with an
associated value or key
■ A max-priority queue supports the operations
Insert(), Maximum(), and ExtractMax()
■ What might a priority queue be useful for?
Priority Queues
● One application of max-priority queues is to
schedule jobs on a shared computer. The max-
priority queue keeps track of the jobs to be
performed and their relative priorities. When a
job is finished or interrupted, the highest-
priority job is selected from those pending
using EXTRACT-MAX. A new job can be
added to the queue at any time using INSERT.
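The job-scheduling use can be sketched with a max-priority queue built on the same 1-indexed max-heap. A minimal Python version (the plain-integer priorities and function names are illustrative, not from the slides):

```python
def max_heapify(A, i, heap_size):
    # Float A[i] down until the subtree rooted at i is a max-heap (1-indexed).
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[i]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)

def heap_extract_max(A):
    # Remove and return the largest key (the root A[1]).
    max_key = A[1]
    A[1] = A[-1]                   # move the last leaf to the root...
    A.pop()
    max_heapify(A, 1, len(A) - 1)  # ...and float it back down
    return max_key

def max_heap_insert(A, key):
    # Append the new key as a leaf, then float it up past smaller parents.
    A.append(key)
    i = len(A) - 1
    while i > 1 and A[i // 2] < A[i]:
        A[i], A[i // 2] = A[i // 2], A[i]
        i //= 2

pq = [None]  # A[0] unused, as in the slides
for priority in [3, 7, 1, 9, 5]:  # e.g. priorities of pending jobs
    max_heap_insert(pq, priority)

first = heap_extract_max(pq)   # highest-priority job runs first
second = heap_extract_max(pq)
print(first, second)  # 9 7
```

Both operations cost O(lg n), matching the heap height.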
