Union-Find (Disjoint Sets)

Topics Covered

Union-Find Concept

Disjoint Sets and Representatives

The Two Operations

The Naive Performance Problem

When to Reach for Union-Find

Path Compression and Union by Rank

Path Compression

Union by Rank

Walk-Through: Building a Union-Find Step by Step

Union by Size (Alternative)

The Inverse Ackermann Bound

Connected Components

Number of Provinces

Component Count Tracking Pattern

Union-Find vs DFS/BFS for Components

Practice

Advanced Union-Find Applications

Redundant Connection (Cycle Detection)

Accounts Merge

When Union-Find Outshines Graph Traversal

Practice

Many problems boil down to a single question: are these two elements in the same group? Social networks ask whether two users are connected. Network engineers ask whether two servers can reach each other. Game developers ask whether two tiles belong to the same island. The Union-Find data structure (also called Disjoint Set Union, or DSU) answers this question in nearly constant time after a lightweight setup phase.

Disjoint Sets and Representatives

A disjoint set collection partitions elements into non-overlapping groups. Each group has exactly one representative, called the root. Two elements belong to the same group if and only if they share the same root. Initially, every element is its own group with itself as the root.

The entire structure is stored in a single array called parent. For an element i, parent[i] holds the index of its parent in the tree. A root points to itself: parent[root] = root. This is all the state you need - no linked lists, no hash maps, just one integer array.

The Two Operations

Find(x) returns the root of the group containing x. Starting at x, follow parent pointers until you reach an element that points to itself. That element is the root. Two elements a and b are in the same group if find(a) == find(b).

Union(x, y) merges the groups containing x and y. Find the roots of both elements. If they are already the same root, the elements are already connected - do nothing. Otherwise, make one root point to the other.

Union-Find merge
Group A012Group B345Two separate groupsparent: [0,0,0, 3,3,3]
Union connects two disjoint trees by making one root point to the other, merging the groups in one step.

The Naive Performance Problem

Without any optimizations, the tree can degenerate into a long chain. Imagine performing union(0,1), union(1,2), union(2,3), and so on - each time attaching the new element at the end. The result is a chain of length n, and find must traverse all n nodes to reach the root. Both find and union become O(n) in the worst case.

This is not hypothetical - it happens whenever unions follow a sequential pattern. The naive structure is no better than a linked list. The next section introduces two optimizations that collapse this worst case down to nearly O(1).

Interview Tip

Union-Find shines in problems where relationships are added incrementally and you need to answer connectivity queries at any point. If you see 'are these two elements connected' or 'count the number of distinct groups', Union-Find should be your first instinct.

When to Reach for Union-Find

Union-Find is the right tool when you need to:

  • Track connected components as edges are added one by one
  • Determine if adding an edge creates a cycle
  • Count the number of distinct groups after a series of merge operations
  • Group elements that share a transitive relationship (if A relates to B, and B relates to C, then A, B, and C are in the same group)

Problems that require splitting groups (removing edges) are a poor fit for basic Union-Find because the union operation is not reversible. If you need both merges and splits, consider other approaches like DFS or link-cut trees.