Lecture 19: Treaps

1 Balanced Search Trees

There are many ways to keep search trees balanced. Some of the best-known data structures are:
1. AVL trees. Binary search trees in which the two children of each node differ in height by at
most 1.
2. Red-Black trees. Binary search trees with a somewhat looser height-balance criterion: no
root-to-leaf path is more than twice as long as any other root-to-leaf path.
3. 2–3 and 2–3–4 trees. Trees with perfect height balance, but whose nodes can have different
numbers of children, so they might not be weight balanced. These are isomorphic to red-black
trees by grouping each black node with its red children, if any.
4. B-trees. A generalization of 2–3–4 trees that allow for a large branching factor, sometimes
up to 1000s of children. Due to their large branching factor, they are well-suited for storing
data on disks.
5. Splay trees. Binary search trees that are only balanced in the amortized sense (i.e. on average
across multiple operations).
6. Weight balanced trees. Trees in which the sizes of the two subtrees of each node stay within
a constant factor of each other. These are most typically binary, but can also have other
branching factors.
7. Treaps. A binary search tree that uses random priorities associated with every element to
keep balance.
8. Random search trees. A variant on treaps in which priorities are not used, but random
decisions are made with probabilities based on tree sizes.
9. Skip trees. A randomized search tree in which nodes are promoted to higher levels based
on flipping coins. These are related to skip lists, which are not technically trees but are also
used as a search structure.
Traditional treatments of binary search trees concentrate on three operations: search, insert
and delete. Out of these, search is naturally parallel since any number of searches can proceed
in parallel with no conflicts.1 However, insert and delete are inherently sequential, as normally
described. For this reason, in later lectures we'll discuss union and difference, which are more
general operations that are useful for parallel updates and of which insert and delete are
special cases.

1 In splay trees and other self-adjusting trees, this is not true, since searches can modify the tree.
2 Treaps
A Treap (tree + heap) is a randomized BST that maintains balance in a probabilistic way. In the
next couple lectures, we will show that with high probability, a treap with n keys will have depth
O(log n). In a Treap, a random priority is assigned to every key. In practice, this is often done
by hashing the key, or by assigning a different random number to each key. We will assume
that each priority is unique, although it is possible to remove this assumption.
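For concreteness, here is one way such priorities might be generated (a minimal Python sketch; the function name and the choice of SHA-256 are ours, not part of the lecture):

    import hashlib

    def priority(key: str) -> int:
        # Hash the key to get a deterministic pseudo-random priority.
        # Distinct keys behave like independent uniform draws, so with
        # 64 bits of output, ties are extremely unlikely.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")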
The nodes in a treap must satisfy two properties:
• BST Property. Their keys satisfy the BST property (i.e., keys are stored in-order in the tree).
• Heap Property. The associated priorities satisfy the (max) heap property, which requires
that the priority at each node is greater than the priorities of its children.
For example, the following treap stores the keys a, b, c, e, f with priorities 3, 9, 2, 6, 5,
respectively:

            (b,9)
           /     \
        (a,3)   (e,6)
                /    \
             (c,2)  (f,5)
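As a sanity check, a minimal sketch (class and function names are our own) that builds this example and verifies both properties:

    class Node:
        def __init__(self, key, pri, left=None, right=None):
            self.key, self.pri, self.left, self.right = key, pri, left, right

    def is_treap(t, lo=None, hi=None):
        # Verify the BST property on keys and the max-heap property on
        # priorities for every node of the tree rooted at t.
        if t is None:
            return True
        if (lo is not None and t.key <= lo) or (hi is not None and t.key >= hi):
            return False                      # BST order violated
        for child in (t.left, t.right):
            if child is not None and child.pri > t.pri:
                return False                  # heap order violated
        return is_treap(t.left, lo, t.key) and is_treap(t.right, t.key, hi)

    example = Node('b', 9, Node('a', 3),
                   Node('e', 6, Node('c', 2), Node('f', 5)))
    assert is_treap(example)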
Theorem 1 For any set S of key-value pairs, there is exactly one treap T containing the key-value
pairs in S which satisfies the treap properties.
Proof. The key k with the highest priority in S must be the root node, since otherwise the tree
would not be in heap order. Then, to satisfy the property that the treap is ordered with respect to
the nodes’ keys, all keys in S less than k must be in the left subtree, and all keys greater than k
must be in the right subtree. Inductively, the two subtrees of k must be constructed in the same
manner.
Now, if you are given a set of elements with keys and priorities, how would you build a treap
out of them? Take the element with the largest priority, make that the root. Then partition around
that root based on keys to find the elements in the right and left subtrees and recurse. What does
this remind you of?
This procedure is exactly the procedure you do for quicksort. There is a straightforward rela-
tionship between the analysis of quicksort and the height of a treap.
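Concretely, the construction might look like the following sketch (our own Python, reusing the Node class from above):

    def build_treap(items):
        # items: a list of (key, priority) pairs with unique keys and
        # unique priorities. Returns the root of the (unique) treap.
        if not items:
            return None
        root_key, root_pri = max(items, key=lambda kp: kp[1])
        left  = [kp for kp in items if kp[0] < root_key]   # partition by key,
        right = [kp for kp in items if kp[0] > root_key]   # exactly as in quicksort
        return Node(root_key, root_pri, build_treap(left), build_treap(right))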
3 Quicksort Analysis
For the analysis, we'll consider a completely equivalent algorithm which will be slightly easier
to analyze. Before the start of the algorithm, we'll pick for each element a priority uniformly
at random from the real interval [0, 1]. At each step, instead of picking a pivot randomly,
we'll instead pick the key with the highest priority (sound familiar?). Notice that once the
priorities are decided, the algorithm is completely deterministic; you should convince yourself
that the two presentations of the algorithm are fully equivalent (modulo the technical details
about how we might store the priority values).
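A sketch of this priority-based quicksort (our own Python, with a comparison counter so we can talk about $X_n$ below):

    import random

    def quicksort(items, counter):
        # items: list of (key, priority) pairs with unique keys and priorities;
        # counter: a one-element list accumulating key comparisons.
        if len(items) <= 1:
            return [key for key, _ in items]
        pivot_key, _ = max(items, key=lambda kp: kp[1])   # highest priority
        less, greater = [], []
        for key, pri in items:
            if key == pivot_key:
                continue
            counter[0] += 1                               # compare against the pivot
            (less if key < pivot_key else greater).append((key, pri))
        return quicksort(less, counter) + [pivot_key] + quicksort(greater, counter)

    keys = list(range(100))
    random.shuffle(keys)
    count = [0]
    assert quicksort([(k, random.random()) for k in keys], count) == sorted(keys)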
We're interested in counting how many comparisons QUICKSORT makes. This immediately
bounds the work for the algorithm because this is where the bulk of the work is done. That is,
if we let $X_n$ denote the number of comparisons QUICKSORT performs on an input of size $n$,
then our goal is to find an upper bound on $E[X_n]$ for any input sequence $S$. For this, we'll
consider the final sorted order2 of the keys $T = \mathrm{SORT}(S)$. In this terminology, we'll
also denote by $p_i$ the priority we chose for the element $T_i$.

2 Formally, there's a permutation $\pi : \{1, \dots, n\} \to \{1, \dots, n\}$ between the positions of $S$ and $T$.
We'll derive an expression for $X_n$ by breaking it up into a bunch of random variables and
bounding them. Consider two positions $i, j \in \{1, \dots, n\}$ in the sequence $T$. We use the
random indicator variables $A_{i,j}$ to indicate whether we compare the elements $T_i$ and $T_j$
during the algorithm, i.e., the variable takes the value 1 if they are compared and 0 otherwise.
Looking closely at the algorithm, observe that if two elements are compared, one of them has
to be the pivot in that call. The other element then goes into either the left partition or the
right partition, while the pivot itself is not part of any partition. Therefore, once an element
has been a pivot, we never compare it to anything again. This gives the following observation:
Observation 1 In the quicksort algorithm, if two elements are compared in a QUICKSORT
call, they will never be compared again in any other call.
Therefore, with these random variables, we can express the total comparison count $X_n$ as
follows:
$$X_n \le \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} A_{i,j}.$$
By linearity of expectation, we have $E[X_n] \le \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[A_{i,j}]$.
Furthermore, since each $A_{i,j}$ is an indicator random variable, $E[A_{i,j}] = \Pr\{A_{i,j} = 1\}$.
Our task therefore comes down to computing the probability that $T_i$ and $T_j$ are compared
(i.e., $\Pr\{A_{i,j} = 1\}$) and working out the sum.
Computing the probability $\Pr\{A_{i,j} = 1\}$. The crux of the matter is in describing the event
$A_{i,j} = 1$ in terms of a simple event that we have a handle on. Before we prove any concrete result,
let's take a closer look at the quicksort algorithm to gather some intuition. Notice that the top
level takes as its pivot $p$ the element with the highest priority. Then, it splits the sequence into
two parts, one with keys larger than $p$ and the other with keys smaller than $p$. For each of these
parts, we run QUICKSORT recursively; therefore, inside it, the algorithm will pick the highest-priority
element as the pivot, which is then used to split the sequence further. With this view, the following
observation is not hard to see:
Claim 2 For $i < j$, $T_i$ and $T_j$ are compared if and only if $p_i$ or $p_j$ has the highest
priority among $\{p_i, p_{i+1}, \dots, p_j\}$.
Proof. We'll first show, by contradiction, that if $T_i$ and $T_j$ are compared, then $p_i$ or $p_j$
is the highest priority. Assume there is a key $T_k$, $i < k < j$, with a higher priority than both.
In any call whose input includes both $T_i$ and $T_j$, $T_k$ will become a pivot before either of
them. Since $T_k$ "sits" between $T_i$ and $T_j$ (i.e., $T_i \le T_k \le T_j$), it will separate $T_i$
and $T_j$ into different buckets, so they are never compared. Conversely, if $p_i$ (say) is the
highest among $\{p_i, \dots, p_j\}$, then no key between $T_i$ and $T_j$ becomes a pivot before
$T_i$ does; hence $T_i$ and $T_j$ stay together in the same recursive calls until $T_i$ is chosen as
the pivot, at which point they are compared.
Therefore, for $T_i$ and $T_j$ to be compared, $p_i$ or $p_j$ has to be larger than all the priorities
in between. Since there are $j - i + 1$ keys from $T_i$ to $T_j$ (including both ends) and each is
equally likely to have the highest priority, the probability that either $i$ or $j$ has the highest is
$2/(j - i + 1)$. Therefore,
$$E[A_{i,j}] = \Pr\{A_{i,j} = 1\} = \Pr\{p_i \text{ or } p_j \text{ is the maximum among } \{p_i, \dots, p_j\}\} = \frac{2}{j - i + 1}.$$
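Before summing these up, here is a quick empirical check of this probability (our own sketch; only the claim itself comes from the lecture):

    import random

    def compared_prob(n, i, j, trials=100_000):
        # T_i and T_j are compared iff position i or j holds the maximum
        # priority among positions i..j (1-indexed), per Claim 2.
        hits = 0
        for _ in range(trials):
            pri = [random.random() for _ in range(n)]
            window = pri[i - 1 : j]
            if max(window) in (pri[i - 1], pri[j - 1]):
                hits += 1
        return hits / trials

    print(compared_prob(10, 3, 7))   # 2/(7 - 3 + 1) = 0.4, so roughly 0.4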
Hence, the expected number of comparisons made is
\begin{align*}
E[X_n] &\le \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[A_{i,j}] \\
&= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j - i + 1} \\
&= \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k + 1} \qquad (k = j - i) \\
&< \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k} \\
&< \sum_{i=1}^{n-1} 2 H_n \\
&= O(n \lg n),
\end{align*}
where $H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} \le 1 + \ln n$ is the $n$th harmonic number.
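For intuition, we can check the bound empirically with the comparison-counting quicksort sketched earlier (our own code; the printed bound is $2n(\ln n + 1) \ge 2 n H_n$):

    import math, random

    n = 1000
    count = [0]
    quicksort([(k, random.random()) for k in range(n)], count)
    print(count[0], "comparisons; bound ~", round(2 * n * (math.log(n) + 1)))
    # Typically roughly 11,000 comparisons against a bound of about 15,800.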
4 Back to Treaps
How does this help with our analysis of treaps? Let us assume that you are given a set of elements
with randomly generated priorities and you create a treap out of them. What is the expected search
cost? The expected search cost is the expected depth of a node in the tree. Let us analyze that.
Consider a set of keys K and associated priorities p : key → int. We assume the priorities are
unique. Consider the keys laid out in order (i.e., by taking an inorder traversal of the Treap), and
as with the analysis of quicksort we use Ti and Tj to refer to two positions in the sorted order.
If we calculate the depth starting with zero at the root, the expected depth of a key equals the
expected number of ancestors it has in the tree. So we want to count ancestors. We use the
random indicator variable $B_{ij}$ to indicate that $T_j$ is an ancestor of $T_i$. Now the expected
depth can be written as:
$$E[\text{depth of } i \text{ in } T] = \sum_{j=1, j \ne i}^{n} E[B_{ij}].$$
To analyze $B_{ij}$, let's consider just the $|j - i| + 1$ keys and associated priorities from $T_i$
to $T_j$, inclusive of both ends. (Note that keys outside of this range are irrelevant, because they
don't affect the probability of $B_{ij}$.) We consider three cases:
1. $T_i$ has the highest priority among them.

2. One of the elements $T_k$ in the middle has the highest priority (i.e., neither $T_i$ nor $T_j$).

3. $T_j$ has the highest priority among them.

In the first case, $T_j$ cannot be an ancestor of $T_i$, since $T_i$ becomes a root before $T_j$
does. In the second case, $T_k$ separates $T_i$ and $T_j$ into different subtrees, so again $T_j$ is
not an ancestor of $T_i$. Only in the third case is $T_j$ an ancestor of $T_i$. Since each of the
$|j - i| + 1$ keys is equally likely to have the highest priority, we get
$E[B_{ij}] = \Pr\{B_{ij} = 1\} = 1/(|j - i| + 1)$.
Now we have
\begin{align*}
E[\text{depth of } i \text{ in } T] &= \sum_{j=1, j \ne i}^{n} \frac{1}{|j - i| + 1} \\
&= \sum_{j=1}^{i-1} \frac{1}{i - j + 1} + \sum_{j=i+1}^{n} \frac{1}{j - i + 1} \\
&= \left(\frac{1}{i} + \frac{1}{i-1} + \cdots + \frac{1}{2}\right) + \left(\frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n - i + 1}\right) \\
&= H_i + H_{n-i+1} - 2 \\
&= O(\log n)
\end{align*}
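As a sanity check on this formula, here is a small experiment (our own sketch, reusing Node and build_treap from above) comparing the measured average depth of the middle key against $H_i + H_{n-i+1} - 2$:

    import random

    def depth(t, key, d=0):
        # Follow the BST search path from the root down to `key`.
        if t.key == key:
            return d
        return depth(t.left if key < t.key else t.right, key, d + 1)

    def H(m):                                   # m-th harmonic number
        return sum(1.0 / k for k in range(1, m + 1))

    n, trials = 101, 2000
    i = (n + 1) // 2                            # the middle key, 1-indexed
    avg = sum(depth(build_treap([(k, random.random()) for k in range(1, n + 1)]), i)
              for _ in range(trials)) / trials
    print(avg, "vs", H(i) + H(n - i + 1) - 2)   # the two should be close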
Exercise 1 Including constant factors, how does the expected depth of the first key compare to
the expected depth of the middle key ($i = n/2$)?
Is this similar to the bound you got for red-black trees? What is the red-black tree guarantee?
It is that the height of the tree is $O(\lg n)$. The answer is no. The fact that the expected depth
of a key (and hence its expected search cost) is $O(\lg n)$ does not imply that the expected height
of the tree is $O(\lg n)$. To see why, think about a family of trees where every key has depth
$O(\lg n)$ except for one key which has depth $\sqrt{n}$. For this family of trees, the expected
depth of a key is $O(\lg n)$, but the expected height of a tree is $\sqrt{n}$. Or, putting it
differently: the height of the tree is the depth of the key that has maximum depth. Even though we
have the expected depth of each key, that doesn't tell us the expected value of the height of the
tree. Recall that $E[\max \text{ depth across all keys}]$ is not the same as
$\max\{E[\text{depth of key } 1], E[\text{depth of key } 2], \dots\}$.
It turns out that in order to argue that the height of a tree is O(lg n), we need a high-probability
bound, which we will study in the next lecture.