Union-Find and Amortized Analysis
1 Introduction
In this set of notes, we study the union-find data structure, which is needed to implement Kruskal’s algorithm efficiently.¹ A union-find data structure maintains a collection of disjoint sets S1, . . . , Sk over a universe U and supports the following operations:
• FIND(x): return Si such that x ∈ Si (in an actual implementation, we would likely just return the
representative element for set Si ).
• UNION(Si, Sj): replace the sets Si and Sj in the collection with the single set Si ∪ Sj.
So for Kruskal’s algorithm, we initialize a union-find data structure over the vertices. For each edge
e = (u, v), if FIND(u) ≠ FIND(v), then we call UNION(FIND(u), FIND(v)) to merge u and v’s components
(otherwise, we move on to the next edge). In total we will issue at most 2m FIND queries and perform at
most n − 1 UNION operations.
¹ Most of this material is from a previous note by Nat Kell and Ang Li for this class in Fall 2014.
2.2 Implementing Union-Find
We now turn to the details of implementing FIND(x) and UNION(Si, Sj) efficiently. Our first implementation
decision is how to represent the “labels” for each set. Here, we will use elements as representatives:
At any given time, there will be a unique x ∈ Si which we will return as the label of Si whenever we call
FIND(y) for any y ∈ Si (in the following implementations, we will make it clear how each representative is
determined/maintained).
Consider, for example, the following sequence of UNION operations:
{x1} ∪ {x2}
{x3} ∪ {x1, x2}
{x4} ∪ {x1, x2, x3}
...
{xn} ∪ {x1, . . . , xn−1}.
Informally, we grow one particular set in the set system, and with each UNION we add one of the
remaining singleton sets to this growing set. When we perform UNION(Si, Sj) in this scheme, note that we
are arbitrarily picking which root (the root of Ti or the root of Tj) becomes the new root when we combine Ti
and Tj. Thus in the above example, it is possible that when we merge S = {xi} with S′ = {x1, . . . , xi−1},
we use xi as the new root each time. If we are unfortunate enough to have this sequence of events happen
for each union, then the resulting tree structure will just be an n-element linked list (and therefore it is still
possible for FIND(x) to take Ω(n) time).
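To make the pitfall concrete, here is a minimal sketch (class and variable names are ours, not from the notes) of a tree implementation with no balancing rule, driven by exactly the merge sequence above; FIND on x1 ends up walking an (n − 1)-link chain.

```python
# A tree-based union-find with NO balancing rule, to illustrate the pitfall:
# if the incoming singleton's element always becomes the new root, the tree
# degenerates into a linked list.

class NaiveUnionFind:
    def __init__(self, n):
        self.parent = list(range(n))  # parent[x] == x means x is a root

    def find(self, x):
        steps = 0
        while self.parent[x] != x:    # walk parent pointers up to the root
            x = self.parent[x]
            steps += 1
        return x, steps               # also report how many links we traversed

    def union(self, root_a, root_b):
        # Arbitrarily make root_b the new root -- exactly the bad choice
        # discussed above.
        self.parent[root_a] = root_b

n = 1000
uf = NaiveUnionFind(n)
# Merge {x_i} into {x_1, ..., x_{i-1}}, always using x_i as the new root.
for i in range(1, n):
    root_prev, _ = uf.find(0)
    uf.union(root_prev, i)

# x_1 (index 0) now sits at the bottom of an n-node chain:
_, steps = uf.find(0)
print(steps)  # 999 parent-pointer hops: this FIND takes Omega(n) time
```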
A straightforward way to fix this pitfall is to do what is called union-by-depth. For each tree Ti, we
keep track of its depth di, i.e., the length of the longest path from the root to any node in the tree. Now when
we perform a UNION, we check to see which tree has the larger depth and then use the root of this tree as
the new root. Note that this extra information can be easily stored and updated with the root of each tree: if
we call UNION(Si, Sj) and di ≤ dj, then the root of Tj becomes the root of Ti ∪ Tj, and we update the depth
of Ti ∪ Tj to be max(dj, di + 1) (note this max is only necessary in the case where di = dj; otherwise, the
depth of the combined tree is no larger than the depth of Tj).
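The rule above can be sketched as follows (a minimal implementation; names are ours): each root stores its depth, UNION hangs the shallower root under the deeper one, and the stored depth grows only on a tie.

```python
# Union-by-depth: attach the shallower tree's root under the deeper tree's
# root; the stored depth increases only when the two depths are equal.

class UnionByDepth:
    def __init__(self, n):
        self.parent = list(range(n))
        self.depth = [0] * n   # depth[r] is meaningful only while r is a root

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        ri, rj = self.find(x), self.find(y)
        if ri == rj:
            return
        if self.depth[ri] > self.depth[rj]:
            ri, rj = rj, ri                   # ensure d_i <= d_j
        self.parent[ri] = rj                  # root of T_j becomes the new root
        self.depth[rj] = max(self.depth[rj], self.depth[ri] + 1)

uf = UnionByDepth(8)
for a, b in [(0, 1), (2, 3), (0, 2), (4, 5), (6, 7), (4, 6), (0, 4)]:
    uf.union(a, b)
root = uf.find(0)
print(uf.depth[root])  # 3 = log2(8): the tree stays balanced
```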
What does “union-by-depth” buy us? The following theorem establishes that this feature does indeed
balance the trees in the set system.
Theorem 1. For a tree implementation of the union-find data structure that uses union-by-depth, any tree
T (representing set Si in the set system) with depth d contains at least 2^d elements.
Proof. We do a proof by induction on the tree depth d. Since a tree T with depth 0 has 2^0 = 1 elements,
the base case is trivial. For the inductive step, assume that the hypothesis holds for all trees with depth k − 1,
i.e., any tree with depth k − 1 contains at least 2^(k−1) nodes. Observe that in order to build a tree T with depth
k, we must merge together two trees Ti and Tj that both have depth k − 1; otherwise, one of the following
two cases would apply:
1. Both Ti and Tj have depth strictly less than k − 1. Since the depth of Ti ∪ Tj can be no more than
max(di, dj) + 1, the combined tree Ti ∪ Tj can have depth at most k − 1 (note this is true regardless of
whether we use union-by-depth).
2. Exactly one tree has depth k − 1; without loss of generality, suppose dj = k − 1 and di < k − 1. Since
we are using union-by-depth, the root of Tj becomes the root of Ti ∪ Tj. Since di < k − 1, the
length of any path from this new root to any node in Ti can be at most k − 1. Since Tj has depth
k − 1 and no node within this subtree changes depth in Ti ∪ Tj, the depth of the combined tree is
exactly k − 1.
Therefore, assume di = dj = k − 1; we can then apply our inductive hypothesis to both Ti and Tj to
obtain:
|T| = |Ti ∪ Tj| = |Ti| + |Tj| ≥ 2^(k−1) + 2^(k−1) = 2^k,
as desired.
It follows that any tree with n elements can have depth at most log n (the theorem implies n ≥ 2^d, where
d is the depth of the largest tree/subset, implying log n ≥ d). Therefore, FIND(x) runs in O(log n) time when
using union-by-depth. From Kruskal’s perspective, this gives us the desired running time. The initial sort we
do on the edge weights takes O(m log m) = O(m log(n^2)) = O(m log n) time. We then do at most n UNIONs
that each take O(1) time and at most 2m FINDs that each take O(log n) time. Therefore, the overall running
time of Kruskal’s using this implementation is O(m log n) + O(n) + O(m log n) = O(m log n).
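Putting the pieces together, the analysis above can be sketched as Kruskal's algorithm driven by a union-by-depth structure (the structure is inlined so the snippet is self-contained; the small example graph is ours, chosen only for illustration).

```python
# Kruskal's algorithm with an inlined union-by-depth union-find.
# The sort dominates at O(m log n); each edge costs two O(log n) FINDs.

def kruskal(n, edges):
    """edges: list of (weight, u, v) tuples; returns the total MST weight."""
    parent, depth = list(range(n)), [0] * n

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(edges):       # O(m log n) sort on edge weights
        ru, rv = find(u), find(v)       # two FINDs per edge
        if ru == rv:
            continue                    # u and v are already connected
        if depth[ru] > depth[rv]:       # union-by-depth
            ru, rv = rv, ru
        parent[ru] = rv
        depth[rv] = max(depth[rv], depth[ru] + 1)
        total += w
    return total

# A 4-cycle with a chord: the MST picks the edges of weight 1, 2, and 3.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(kruskal(4, edges))  # 6
```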
2.2.3 Union-Find with Stars
Although the tree implementation with union-by-depth gave us the desired asymptotic running
time of O(m log n), it is a bit unsettling that UNIONs take constant time while FINDs could take Ω(log n)
time. Since n = O(m) for any graph where we want to find a spanning tree, it seems a bit wasteful that
our implementation gives a faster running time for the operation we call fewer times (recall we perform
n UNIONs and at most 2m FINDs). Therefore, in this section, we will look at an implementation where we
force each FIND to take O(1) time, but as a result make UNION operations more expensive (though hopefully
not by too much).
The most naive way to achieve O(1)-time FINDs is to represent sets as star graphs. A star graph is
simply a tree with a designated center node such that every other node in the graph is a leaf that is only
adjacent (or points) to this center node. Thus, we will maintain that each tree Ti that represents a set Si is a
star graph, where the center node of Ti is the representative of Si. Clearly with this scheme, when we call
FIND(x) we must traverse at most one link to reach the representative node, and therefore the running
time of FIND(x) is O(1).
However, to maintain this star-graph structure, we will need to take more time when we make a UNION
call. If we have two star graphs Ti and Tj that we want to merge, we first need to pick which representative
element we will use for Ti ∪ Tj (just as in our previous implementation with balanced trees). If we pick
Ti’s center ci to be the new center, we then need to iterate through every element x ∈ Tj and make x point to
ci. Since Tj could have Ω(n) elements, this operation could take Ω(n) time. Therefore, if we do n UNION
operations, the running time of Kruskal’s merging procedure could be as bad as Θ(n^2) (which could be
worse than O(m log n)).
To avoid this problem, we will use a rule that is similar to union-by-depth, namely union-by-size: given
two star graphs Ti and Tj, we disassemble the smaller of the two sets and make its elements point to the
center of the larger set (leaving the star graph of the larger set untouched).
To analyze the speedup obtained from union-by-size, we use a charging argument to do an amortized
analysis over the n UNIONs performed by Kruskal’s. We use the following charging scheme: any time
we merge two trees Ts and Tℓ such that |Ts| ≤ |Tℓ|, we simply put a unit of charge on each element in Ts
(remember that we are taking the elements of Ts and changing their pointers to the center of Tℓ). Note that
for all x ∈ Ts, x now belongs to a set that is at least twice as large. We also know that for any x ∈ U, the size
of the set to which x belongs can double at most log n times (the final merged set has at most n elements);
therefore, the charge on a given element x can be at most log n after n unions. Since the total time needed
over all n unions is proportional to the total charge distributed over the elements, the time it takes to make
n UNION calls is O(n log n). Note that even though Kruskal’s algorithm still runs in O(m log n) time, since
we must initially sort the edges, we have reduced the time it takes to execute Kruskal’s merging procedure
to O(n log n + m).
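The star-graph scheme with union-by-size can be sketched as follows (names and the merge schedule are ours). The counter tracks exactly the charge from the argument above, and a balanced merge schedule realizes the (n/2) log n worst case of the bound.

```python
# Star-graph union-find with union-by-size: every non-center node points
# directly at its set's center, so FIND is one hop; UNION relinks only the
# smaller set's elements. We count pointer updates to check the bound.

class StarUnionFind:
    def __init__(self, n):
        self.center = list(range(n))            # center[x]: x's representative
        self.members = [[x] for x in range(n)]  # members[c]: c's set, if c is a center
        self.relinks = 0                        # total charge handed out

    def find(self, x):
        return self.center[x]                   # O(1): at most one hop

    def union(self, ci, cj):
        # ci, cj are centers; relink the smaller set into the larger one.
        if len(self.members[ci]) > len(self.members[cj]):
            ci, cj = cj, ci
        for x in self.members[ci]:
            self.center[x] = cj
            self.relinks += 1                   # one unit of charge on x
        self.members[cj] += self.members[ci]
        self.members[ci] = []

# Merge 1024 singletons pairwise, then pairs of pairs, and so on: every
# element's set doubles log2(1024) = 10 times, so total relinks is
# (n/2) * 10 = 5120 -- matching the O(n log n) bound, far below n^2.
n = 1024
uf = StarUnionFind(n)
step = 1
while step < n:
    for i in range(0, n, 2 * step):
        uf.union(uf.find(i), uf.find(i + step))
    step *= 2
print(uf.relinks)  # 5120
```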
modification, and therefore after the procedure completes, ri and all the elements along P now form a star
graph in Ti. Note that it is not too hard to implement FIND(x) such that it returns ri, makes every element in
P point directly to ri, and runs in O(|P|) time.
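A sketch of such a path-compressing FIND (function and variable names are ours): one pass walks the path P to find the root, and a second pass points every element of P directly at it, so the whole call runs in O(|P|) time.

```python
# FIND with path compression: locate the root, then re-point every node on
# the traversed path P directly at it, turning P into part of a star.

def find_compress(parent, x):
    path = []                        # the path P from x up to (not including) the root
    while parent[x] != x:
        path.append(x)
        x = parent[x]                # x is now the root
    for y in path:                   # second pass: hang P directly off the root
        parent[y] = x
    return x

# A hand-built chain 0 -> 1 -> 2 -> 3 -> 4 (node 4 is the root):
parent = [1, 2, 3, 4, 4]
root = find_compress(parent, 0)
print(root, parent)  # 4 [4, 4, 4, 4, 4]: the whole path now forms a star
```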
To implement UNION, we essentially still use union-by-depth: we merge components using the
same rule (we make the tree with the smaller depth a subtree of the tree with the larger depth). Note,
however, that because we might have path-compressing FIND calls in between UNION calls, we might
compress a path that defined the depth of a given tree Ti. In such a case, di no longer accurately stores the
depth of Ti.
How does one fix this issue? The answer is that we do not. Instead, we just call this di the rank of Ti and
use it in the same way we would in our union-by-depth scheme. It turns out that using these two features in
combination gives us an extremely good bound. We define log∗ n to be the number of times the logarithm
must be applied to n before the result is at most 1, and we partition the nodes into groups by rank, where
group k contains the nodes whose rank lies in {k + 1, k + 2, . . . , 2^k}.
We give 2^k dollars to each node in group k. Since a node of rank r has at least 2^r descendants, there are
at most n/2^k nodes whose rank is in this group, so we give at most O(n) dollars in total to each group.
There are O(log∗ n) groups, so we give out O(n log∗ n) dollars in total.
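Combining the two features gives the following sketch of union-by-rank with path compression (names are ours). The rank update mirrors the depth update from union-by-depth, but after compression the rank is only an upper bound on the true depth, which is all the argument needs.

```python
# Union-by-rank with path compression combined in one structure.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n           # upper bound on depth once compression runs

    def find(self, x):
        root = x
        while self.parent[root] != root:      # first pass: locate the root
            root = self.parent[root]
        while self.parent[x] != root:         # second pass: compress the path
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] > self.rank[ry]:     # union-by-rank: lower-rank root
            rx, ry = ry, rx                   # hangs under higher-rank root
        self.parent[rx] = ry
        self.rank[ry] = max(self.rank[ry], self.rank[rx] + 1)

uf = UnionFind(6)
uf.union(0, 1); uf.union(2, 3); uf.union(0, 3); uf.union(4, 5)
print(uf.find(0) == uf.find(2), uf.find(0) == uf.find(4))  # True False
```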