Graph Theory
Algorithm Design
Independent Set
Tree Structures
Computational Complexity

algorithm to find max independent set in a tree

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Finding a maximum independent set is hard on general graphs, but trees are a much friendlier case. Because a tree has no cycles, you can solve the problem in linear time with dynamic programming. The core idea is to decide, for each node, what the best answer is when that node is included and when it is excluded.

Define the Two States Per Node

Root the tree anywhere, often at node 0. For each node u, compute two values:

  • 'include[u]: best size if u is in the independent set'
  • 'exclude[u]: best size if u is not in the independent set'

The recurrence is simple:

  • if u is included, none of its children may be included
  • if u is excluded, each child may be included or excluded, whichever is better

That gives:

  • 'include[u] = 1 + sum(exclude[child])'
  • 'exclude[u] = sum(max(include[child], exclude[child]))'

A Depth-First Search Computes the Answer

Because every subtree is independent once the parent decision is fixed, a post-order DFS is enough:

python
1def max_independent_set(tree, root=0):
2    n = len(tree)
3    include = [0] * n
4    exclude = [0] * n
5
6    def dfs(node, parent):
7        include[node] = 1
8        exclude[node] = 0
9
10        for child in tree[node]:
11            if child == parent:
12                continue
13            dfs(child, node)
14            include[node] += exclude[child]
15            exclude[node] += max(include[child], exclude[child])
16
17    dfs(root, -1)
18    return max(include[root], exclude[root])
19
20
21tree = [
22    [1, 2],
23    [0, 3, 4],
24    [0, 5],
25    [1],
26    [1],
27    [2],
28]
29
30print(max_independent_set(tree))

This program returns the size of the maximum independent set for the tree represented as an adjacency list.

Why the Recurrence Works

Suppose you include a node u. Then every edge from u to a child forbids that child from also being chosen. So the best each child subtree can contribute is exclude[child].

If you exclude u, each child subtree becomes independent of the others. For each child you just take whichever option is larger:

  • include the child
  • exclude the child

That local optimal choice is valid because trees do not have cross-links between sibling subtrees.

Recovering the Actual Set, Not Just the Size

Sometimes you need the vertices, not only the count. After computing include and exclude, perform a second traversal that reconstructs the choices:

  • if the parent was included, the current node must be excluded
  • otherwise choose the larger of include[node] and exclude[node]

That adds another linear pass, so the overall complexity remains O(n).

Example Intuition

Consider this tree:

text
1    0
2   / \
3  1   2
4 / \   \
53   4   5

A good independent set is 0, 3, 4, 5, which has size 4. Notice how including node 0 forces 1 and 2 out, but then their children become available again. The dynamic programming recurrence captures exactly that tradeoff.

Complexity

Each edge is visited a constant number of times during DFS, so the running time is O(n). The memory cost is also O(n) for the adjacency list and the two DP arrays.

That is a major improvement over the general-graph version of the problem, which is computationally hard.

Common Pitfalls

  • Applying this tree DP directly to a graph with cycles.
  • Forgetting to pass the parent in DFS and accidentally recursing back upward forever.
  • Mixing up the meaning of the include and exclude states.
  • Returning only include[root] instead of max(include[root], exclude[root]).
  • Computing the size correctly but forgetting that reconstructing the chosen nodes needs a second pass.

Summary

  • Maximum independent set on a tree is solvable in linear time.
  • Use two DP states per node: include it or exclude it.
  • A post-order DFS computes both states bottom up.
  • The final answer is max(include[root], exclude[root]).
  • A second traversal can reconstruct the actual chosen vertices if needed.

Course illustration
Course illustration

All Rights Reserved.