algorithm to find max independent set in a tree
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Finding a maximum independent set is hard on general graphs, but trees are a much friendlier case. Because a tree has no cycles, you can solve the problem in linear time with dynamic programming. The core idea is to decide, for each node, what the best answer is when that node is included and when it is excluded.
Define the Two States Per Node
Root the tree anywhere, often at node 0. For each node u, compute two values:
- '
include[u]: best size ifuis in the independent set' - '
exclude[u]: best size ifuis not in the independent set'
The recurrence is simple:
- if
uis included, none of its children may be included - if
uis excluded, each child may be included or excluded, whichever is better
That gives:
- '
include[u] = 1 + sum(exclude[child])' - '
exclude[u] = sum(max(include[child], exclude[child]))'
A Depth-First Search Computes the Answer
Because every subtree is independent once the parent decision is fixed, a post-order DFS is enough:
This program returns the size of the maximum independent set for the tree represented as an adjacency list.
Why the Recurrence Works
Suppose you include a node u. Then every edge from u to a child forbids that child from also being chosen. So the best each child subtree can contribute is exclude[child].
If you exclude u, each child subtree becomes independent of the others. For each child you just take whichever option is larger:
- include the child
- exclude the child
That local optimal choice is valid because trees do not have cross-links between sibling subtrees.
Recovering the Actual Set, Not Just the Size
Sometimes you need the vertices, not only the count. After computing include and exclude, perform a second traversal that reconstructs the choices:
- if the parent was included, the current node must be excluded
- otherwise choose the larger of
include[node]andexclude[node]
That adds another linear pass, so the overall complexity remains O(n).
Example Intuition
Consider this tree:
A good independent set is 0, 3, 4, 5, which has size 4. Notice how including node 0 forces 1 and 2 out, but then their children become available again. The dynamic programming recurrence captures exactly that tradeoff.
Complexity
Each edge is visited a constant number of times during DFS, so the running time is O(n). The memory cost is also O(n) for the adjacency list and the two DP arrays.
That is a major improvement over the general-graph version of the problem, which is computationally hard.
Common Pitfalls
- Applying this tree DP directly to a graph with cycles.
- Forgetting to pass the parent in DFS and accidentally recursing back upward forever.
- Mixing up the meaning of the include and exclude states.
- Returning only
include[root]instead ofmax(include[root], exclude[root]). - Computing the size correctly but forgetting that reconstructing the chosen nodes needs a second pass.
Summary
- Maximum independent set on a tree is solvable in linear time.
- Use two DP states per node: include it or exclude it.
- A post-order DFS computes both states bottom up.
- The final answer is
max(include[root], exclude[root]). - A second traversal can reconstruct the actual chosen vertices if needed.

