Introduction To Algorithms: Chapter 6: Heap Sort
Heap sort
Has a running time of O(n lg n)
Heaps
A heap can be seen as a complete binary tree
The tree is completely filled on all levels except possibly the lowest.
In practice, heaps are usually implemented as arrays
A = 16 14 10 8 7 9 3 2 4 1
Referencing Heap Elements
The root node is A[1]
Node i is A[i]
Parent(i)
    return ⌊i/2⌋
Left(i)
    return 2*i
Right(i)
    return 2*i + 1
Index: 1  2  3  4  5  6  7  8  9  10
A:     16 14 10 8  7  9  3  2  4  1
Level (root row down to leaf row): 3 2 1 0
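The index arithmetic above can be tried out directly. The sketch below keeps the slide's 1-based convention by leaving A[0] unused; the 0-based variants (`parent0`, `left0`, `right0`) are hypothetical helper names added here for languages whose arrays start at 0, not part of the slides.

```python
# 1-based index arithmetic, as written on the slide.
def parent(i: int) -> int:
    return i // 2            # floor(i / 2)

def left(i: int) -> int:
    return 2 * i

def right(i: int) -> int:
    return 2 * i + 1

# Hypothetical 0-based variants (an assumption, not from the slides).
def parent0(i: int) -> int:
    return (i - 1) // 2

def left0(i: int) -> int:
    return 2 * i + 1

def right0(i: int) -> int:
    return 2 * i + 2

# The example heap; A[0] is an unused placeholder so that A[1] is the root.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

i = 4                                    # the node holding key 8
children = (A[left(i)], A[right(i)])     # keys stored at indices 8 and 9
print(A[parent(i)], children)
```

Shifting every index down by one turns the 1-based formulas into the 0-based ones, which is why both sets of helpers describe the same tree.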
Heaps
Have the following procedures:
The Heapify() procedure, which runs in O(lg n) time, is the key to maintaining the max-heap property.
Heap Operations: Heapify()
Heapify(): maintain the heap property
Given: a node i in the heap with children L and R
The two subtrees rooted at L and R are assumed to be heaps
Problem: the subtree rooted at i may violate the heap property (How?)
A[i] may be smaller than one or both of its children
Heap Operations: Heapify()
Heapify(A, i)
{
    L ← left(i)
    R ← right(i)
    if L ≤ heap-size[A] and A[L] > A[i]
        then largest ← L
        else largest ← i
    if R ≤ heap-size[A] and A[R] > A[largest]
        then largest ← R
    if largest ≠ i
        then exchange A[i] ↔ A[largest]
             Heapify(A, largest)
}
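The procedure translates almost line for line into Python. The following is a minimal sketch, assuming 0-based list indices (so the children of i are 2i + 1 and 2i + 2) and an explicit `heap_size` argument in place of heap-size[A]; the name `max_heapify` is chosen here for illustration.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    Indices are 0-based, so the children of i are 2*i + 1 and 2*i + 2.
    """
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]   # exchange A[i] <-> A[largest]
        max_heapify(a, largest, heap_size)    # continue in the affected subtree


# The array from the Heapify() example; the slide's node 2 is index 1 here.
a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(a, 1, len(a))
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The key 4 moves down twice (first past 14, then past 8), which is the sequence of states shown in the example slides.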
Heapify() Example
16
4 10
14 7 9 3
2 8 1
A = 16 4 10 14 7 9 3 2 8 1
Heapify() Example
16
14 10
4 7 9 3
2 8 1
A = 16 14 10 4 7 9 3 2 8 1
Heapify() Example
16
14 10
8 7 9 3
2 4 1
A = 16 14 10 8 7 9 3 2 4 1
Heap Height
Definitions:
The height of a node in the tree = the number of
edges on the longest downward path to a leaf
Number of nodes at each level
Fact: an n-element heap has at most $2^{h-k}$ nodes at level k, where h is the height of the tree
Heap Height
A heap storing n keys has height $h = \lfloor \lg n \rfloor = \Theta(\lg n)$
Because the heap is completely filled except possibly at the lowest level, we know:
The maximum number of nodes in a heap of height h:
$2^h + 2^{h-1} + \dots + 2^2 + 2^1 + 2^0 = \sum_{i=0}^{h} 2^i = (2^{h+1} - 1)/(2 - 1) = 2^{h+1} - 1$
The minimum number of nodes in a heap of height h:
$1 + 2^{h-1} + \dots + 2^2 + 2^1 + 2^0 = \sum_{i=0}^{h-1} 2^i + 1 = (2^{h-1+1} - 1)/(2 - 1) + 1 = 2^h$
Therefore
$2^h \le n \le 2^{h+1} - 1$
$h \le \lg n$ and $\lg(n+1) - 1 \le h$
$\lg(n+1) - 1 \le h \le \lg n$
which in turn implies:
$h = \lfloor \lg n \rfloor = \Theta(\lg n)$
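The bounds on n for a given height can be checked numerically. This is a small sketch, assuming n ≥ 1; `heap_height` is a hypothetical helper name, and it uses Python's `int.bit_length` to get an exact integer floor of lg n.

```python
def heap_height(n: int) -> int:
    """Height of an n-element heap, floor(lg n), for n >= 1."""
    if n < 1:
        raise ValueError("a heap needs at least one element")
    return n.bit_length() - 1   # exact floor(log2(n)) without floating point


# Check 2^h <= n <= 2^(h+1) - 1 for a range of heap sizes.
for n in range(1, 1025):
    h = heap_height(n)
    assert 2 ** h <= n <= 2 ** (h + 1) - 1

print(heap_height(10))  # 3, the height of the 10-element example heap
```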
Analyzing Heapify()
The running time at any given node i is
Θ(1) time to fix up the relationships among A[i], A[Left(i)] and A[Right(i)]
plus the time to call Heapify recursively on a subtree rooted at one of the children of node i
Analyzing Heapify()
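So we have
T(n) ≤ T(2n/3) + Θ(1)
(a child's subtree has at most 2n/3 nodes; the worst case occurs when the lowest level is exactly half full)
By case 2 of the Master Theorem,
T(n) = O(lg n)
Alternatively, in the worst case Heapify takes T(n) = Θ(h)
h = height of heap = ⌊lg n⌋
T(n) = Θ(lg n)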
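As a cross-check on the Master Theorem result, the recurrence can also be unrolled directly. This is a sketch in which c is an assumed constant bounding the Θ(1) term:

$$T(n) \le T\left(\tfrac{2}{3}n\right) + c \le T\left(\left(\tfrac{2}{3}\right)^{2} n\right) + 2c \le \dots \le T\left(\left(\tfrac{2}{3}\right)^{k} n\right) + kc$$

$$\left(\tfrac{2}{3}\right)^{k} n \le 1 \iff k \ge \log_{3/2} n, \quad\text{so}\quad T(n) \le T(1) + c\,\lceil \log_{3/2} n \rceil = O(\lg n)$$

The base of the logarithm only changes the constant factor, which is why the bound is stated as O(lg n).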
Heap Operations: BuildHeap()
We can build a heap in a bottom-up manner by
running Heapify() on successive subarrays
Fact: for an array of length n, all elements in the range
A[⌊n/2⌋ + 1 .. n] are heaps (Why?)
These elements are leaves; they have no children
BuildHeap(A)
{
    heap-size[A] ← length[A]
    for i ← ⌊length[A]/2⌋ downto 1
        do Heapify(A, i)
}
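A minimal Python sketch of the bottom-up construction follows. It assumes 0-based indices, so the last internal node is at index n//2 − 1 rather than ⌊n/2⌋; the function names are chosen here for illustration.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down; assumes both child subtrees are already max-heaps."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: list) -> None:
    """Turn an arbitrary list into a max-heap in place."""
    # Indices n//2 .. n-1 are leaves, so start at the last internal node
    # and work back toward the root.
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(a)
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The input is the array used in the BuildHeap() example, and the result matches the final heap shown there.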
BuildHeap() Example
Work through example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
n = 10, ⌊n/2⌋ = 5
Tree, one level per line (index:key):
1:4
2:1  3:3
4:2  5:16  6:9  7:10
8:14  9:8  10:7
BuildHeap() Example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
Tree, one level per line (index:key); i = 5 is the node about to be heapified:
1:4
2:1  3:3
4:2  5:16  6:9  7:10
8:14  9:8  10:7
BuildHeap() Example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
Tree, one level per line (index:key); i = 4 is the node about to be heapified:
1:4
2:1  3:3
4:2  5:16  6:9  7:10
8:14  9:8  10:7
BuildHeap() Example
A = {4, 1, 3, 14, 16, 9, 10, 2, 8, 7}
Tree, one level per line (index:key); i = 3 is the node about to be heapified:
1:4
2:1  3:3
4:14  5:16  6:9  7:10
8:2  9:8  10:7
BuildHeap() Example
A = {4, 1, 10, 14, 16, 9, 3, 2, 8, 7}
Tree, one level per line (index:key); i = 2 is the node about to be heapified:
1:4
2:1  3:10
4:14  5:16  6:9  7:3
8:2  9:8  10:7
BuildHeap() Example
A = {4, 16, 10, 14, 7, 9, 3, 2, 8, 1}
Tree, one level per line (index:key); i = 1 is the node about to be heapified:
1:4
2:16  3:10
4:14  5:7  6:9  7:3
8:2  9:8  10:1
BuildHeap() Example
A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}
Final heap, one level per line (index:key):
1:16
2:14  3:10
4:8  5:7  6:9  7:3
8:2  9:4  10:1
Analyzing BuildHeap()
Each call to Heapify() takes O(lg n) time
There are O(n) such calls (specifically, ⌊n/2⌋)
Thus the running time is O(n lg n)
Is this a correct asymptotic upper bound?
YES
Is this an asymptotically tight bound?
NO
A tighter bound is O(n)
How can this be? Is there a flaw in the above reasoning?
We can derive a tighter bound by observing that the time for
Heapify to run at a node varies with the height of the node in
the tree, and the heights of most nodes are small.
Fact: an n-element heap has at most $2^{h-k}$ nodes at level k, where h is the height of the tree.
Analyzing BuildHeap(): Tight
The time required by Heapify on a node of height k is O(k).
So we can express the total cost of BuildHeap as
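Combining the two facts (at most $2^{h-k}$ nodes at level k, each costing O(k)), one way to write out and bound the sum is the following sketch, which relies on $\sum_{k \ge 0} k/2^k = 2$ and $2^h \le n$:

$$\sum_{k=0}^{h} 2^{h-k}\, O(k) = O\left(2^{h} \sum_{k=0}^{h} \frac{k}{2^{k}}\right) \le O\left(2^{h} \sum_{k=0}^{\infty} \frac{k}{2^{k}}\right) = O\left(2^{h} \cdot 2\right) = O(n)$$

So BuildHeap runs in linear time even though a single Heapify call can cost O(lg n).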
Heapsort
Heapsort(A)
{
    BuildHeap(A)
    for i ← length[A] downto 2
        do exchange A[1] ↔ A[i]
           heap-size[A] ← heap-size[A] - 1
           Heapify(A, 1)
}
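Put together, the whole sort fits in a few lines of Python. This is a minimal in-place sketch assuming 0-based indices; the shrinking heap size is tracked by the loop variable `end` (a name chosen here) instead of a heap-size[A] field.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down; assumes both child subtrees are already max-heaps."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def heapsort(a: list) -> None:
    """Sort a in ascending order, in place."""
    n = len(a)
    # Build a max-heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)
    # Repeatedly move the current maximum behind the shrinking heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        max_heapify(a, 0, end)   # the heap now occupies a[0:end]


data = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heapsort(data)
print(data)  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```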
HeapSort() Example
A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}
Initial heap, one level per line (index:key):
1:16
2:14  3:10
4:8  5:7  6:9  7:3
8:2  9:4  10:1
HeapSort() Example
A = {14, 8, 10, 4, 7, 9, 3, 2, 1, 16}
State after the iteration with i = 10. Heap, one level per line (index:key):
1:14
2:8  3:10
4:4  5:7  6:9  7:3
8:2  9:1
Sorted (outside the heap): 10:16
HeapSort() Example
A = {10, 8, 9, 4, 7, 1, 3, 2, 14, 16}
State after the iteration with i = 9. Heap, one level per line (index:key):
1:10
2:8  3:9
4:4  5:7  6:1  7:3
8:2
Sorted (outside the heap): 9:14  10:16
HeapSort() Example
A = {9, 8, 3, 4, 7, 1, 2, 10, 14, 16}
State after the iteration with i = 8. Heap, one level per line (index:key):
1:9
2:8  3:3
4:4  5:7  6:1  7:2
Sorted (outside the heap): 8:10  9:14  10:16
HeapSort() Example
A = {8, 7, 3, 4, 2, 1, 9, 10, 14, 16}
State after the iteration with i = 7. Heap, one level per line (index:key):
1:8
2:7  3:3
4:4  5:2  6:1
Sorted (outside the heap): 7:9  8:10  9:14  10:16
HeapSort() Example
A = {7, 4, 3, 1, 2, 8, 9, 10, 14, 16}
State after the iteration with i = 6. Heap, one level per line (index:key):
1:7
2:4  3:3
4:1  5:2
Sorted (outside the heap): 6:8  7:9  8:10  9:14  10:16
HeapSort() Example
A = {4, 2, 3, 1, 7, 8, 9, 10, 14, 16}
State after the iteration with i = 5. Heap, one level per line (index:key):
1:4
2:2  3:3
4:1
Sorted (outside the heap): 5:7  6:8  7:9  8:10  9:14  10:16
HeapSort() Example
A = {3, 2, 1, 4, 7, 8, 9, 10, 14, 16}
State after the iteration with i = 4. Heap, one level per line (index:key):
1:3
2:2  3:1
Sorted (outside the heap): 4:4  5:7  6:8  7:9  8:10  9:14  10:16
HeapSort() Example
A = {2, 1, 3, 4, 7, 8, 9, 10, 14, 16}
State after the iteration with i = 3. Heap, one level per line (index:key):
1:2
2:1
Sorted (outside the heap): 3:3  4:4  5:7  6:8  7:9  8:10  9:14  10:16
HeapSort() Example
A = {1, 2, 3, 4, 7, 8, 9, 10, 14, 16}
State after the iteration with i = 2. Heap, one level per line (index:key):
1:1
Sorted (outside the heap): 2:2  3:3  4:4  5:7  6:8  7:9  8:10  9:14  10:16
Analyzing Heapsort
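The call to BuildHeap() takes O(n) time
Each of the n - 1 calls to Heapify() takes O(lg n) time
Thus the total time taken by Heapsort() is O(n) + (n - 1) O(lg n) = O(n lg n)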
Analyzing Heapsort
The O(n lg n) running time of heapsort is asymptotically much better than the $O(n^2)$ worst-case running time of selection sort and insertion sort
Max-Priority Queues
A data structure for maintaining a set S of elements,
each with an associated value called a key.
Applications:
scheduling jobs on a shared computer
prioritizing events to be processed based on their predicted
time of occurrence.
Printer queue
Max-Priority Queue: Basic Operations
Maximum(S): return A[1]
returns the element of S with the largest key (value)
Extract-Max(S):
removes and returns the element of S with the largest key
Increase-Key(S, x, k):
increases the value of element x’s key to the new value k, x.value ← k (k must be at least as large as x’s current key)
Insert(S, x):
inserts the element x into the set S, i.e. S ← S ∪ {x}
Extract-Max(A)
    if heap-size[A] < 1                   // zero elements
        then error “heap underflow”
    max ← A[1]                            // max element in first position
    A[1] ← A[heap-size[A]]                // value of last position assigned to first position
    heap-size[A] ← heap-size[A] – 1
    Heapify(A, 1)
    return max
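A minimal Python sketch of Extract-Max follows. It assumes the heap occupies the whole list, so heap-size[A] is simply `len(a)`, and it uses 0-based indices; the function names are chosen here for illustration.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down; assumes both child subtrees are already max-heaps."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def heap_extract_max(a: list):
    """Remove and return the largest key of the max-heap a."""
    if not a:
        raise IndexError("heap underflow")
    maximum = a[0]              # max element is in the first position
    last = a.pop()              # shrink the heap by one element
    if a:                       # nothing to repair if the heap is now empty
        a[0] = last             # last element moves to the root
        max_heapify(a, 0, len(a))
    return maximum


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
largest = heap_extract_max(heap)
print(largest, heap)  # 16 [14, 8, 10, 4, 7, 9, 3, 2, 1]
```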
Increase-Key(A, i, key)
// increase a value (key) in the array
1. if key < A[i]
2.     then error “new key is smaller than current key”
3. A[i] ← key
4. while i > 1 and A[Parent(i)] < A[i]
5.     do exchange A[i] ↔ A[Parent(i)]
6.        i ← Parent(i)   // move index up to parent
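The same loop in Python, as a minimal sketch assuming 0-based indices (so the parent of i is (i − 1) // 2 and the root is index 0); `heap_increase_key` is a name chosen here for illustration.

```python
def heap_increase_key(a: list, i: int, key) -> None:
    """Raise a[i] to key and restore the max-heap property by moving it up."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    # Swap with the parent while the parent is smaller.
    while i > 0 and a[(i - 1) // 2] < a[i]:
        parent = (i - 1) // 2
        a[i], a[parent] = a[parent], a[i]
        i = parent               # move index up to parent


# The slide example: node 9 (index 8 here) is increased from 4 to 15.
a = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_increase_key(a, 8, 15)
print(a)  # [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
```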
Increase-Key() Example
A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}
Tree, one level per line (index:key); i = 9 is the node whose key will be increased:
1:16
2:14  3:10
4:8  5:7  6:9  7:3
8:2  9:4  10:1
Increase-Key() Example
A = {16, 14, 10, 8, 7, 9, 3, 2, 15, 1}
The key at index i is increased to 15.
Tree, one level per line (index:key); i = 9:
1:16
2:14  3:10
4:8  5:7  6:9  7:3
8:2  9:15  10:1
Increase-Key() Example
A = {16, 14, 10, 15, 7, 9, 3, 2, 8, 1}
After one iteration of the while loop of lines 4-6, the node and its parent have exchanged keys (values), and the index i moves up to the parent.
Tree, one level per line (index:key); i = 4:
1:16
2:14  3:10
4:15  5:7  6:9  7:3
8:2  9:8  10:1
Increase-Key() Example
A = {16, 15, 10, 14, 7, 9, 3, 2, 8, 1}
After one more iteration of the while loop. The max-heap property now holds and the procedure terminates.
Tree, one level per line (index:key); i = 2:
1:16
2:15  3:10
4:14  5:7  6:9  7:3
8:2  9:8  10:1
Insert(A, key)
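Insert can be built from Increase-Key: grow the heap by one slot, then raise that slot to the new key. The sketch below follows that idea under stated assumptions: 0-based indices, numeric keys (so that negative infinity can serve as the placeholder), and function names chosen here for illustration.

```python
def heap_increase_key(a: list, i: int, key) -> None:
    """Raise a[i] to key and move it up until its parent is not smaller."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        parent = (i - 1) // 2
        a[i], a[parent] = a[parent], a[i]
        i = parent


def max_heap_insert(a: list, key) -> None:
    """Insert key into the max-heap a."""
    a.append(float("-inf"))                # new leaf with a minimal placeholder
    heap_increase_key(a, len(a) - 1, key)  # raise it to the real key


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
max_heap_insert(heap, 15)
print(heap)  # [16, 15, 10, 8, 14, 9, 3, 2, 4, 1, 7]
```

The new key 15 starts as a child of 7, moves past 7 and then past 14, and stops below the root 16.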
Example: Operation of Heap Insert
Running Time
Running time of Maximum is Θ(1)
Running time of Extract-Max is O(lg n).
Performs only a constant amount of work + time of
Heapify, which takes O(lg n) time
Running time of Increase-Key is O(lg n).
The path traced from the updated node to the root has length O(lg n).
Running time of Insert is O(lg n).
Performs only a constant amount of work + time of
Increase-Key, which takes O(lg n) time
In summary, a heap can support any max-priority queue operation on a set of size n in O(lg n) time.
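For everyday use in Python, the standard library's `heapq` module already provides a binary heap. It is a min-heap, so the sketch below swaps in a different technique from the slides: keys are negated on the way in and out to get max-priority behaviour. The `MaxPriorityQueue` class and the job names are hypothetical, and `heapq` offers no direct Increase-Key operation, so that one is left out.

```python
import heapq


class MaxPriorityQueue:
    """Max-priority queue on top of heapq (a min-heap) using negated keys."""

    def __init__(self) -> None:
        self._heap = []

    def insert(self, key, item) -> None:
        heapq.heappush(self._heap, (-key, item))   # O(lg n)

    def maximum(self):
        key, item = self._heap[0]                  # O(1), the root of the heap
        return -key, item

    def extract_max(self):
        key, item = heapq.heappop(self._heap)      # O(lg n)
        return -key, item

    def __len__(self) -> int:
        return len(self._heap)


# Scheduling jobs by priority, as in the applications listed earlier.
jobs = MaxPriorityQueue()
jobs.insert(2, "backup")
jobs.insert(9, "deploy")
jobs.insert(5, "report")
order = [jobs.extract_max()[1] for _ in range(len(jobs))]
print(order)  # ['deploy', 'report', 'backup']
```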