Subset Sum algorithm

Subset Sum

Algorithm

Computer Science

Dynamic Programming

NP-Complete

Subset Sum algorithm

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

The subset sum problem asks whether some subset of a set of integers adds up to a target value. It is a classic problem because the brute-force solution is simple but expensive, while dynamic programming gives a practical exact solution when the target sum is not too large.

Problem Definition

Given a list of integers and a target, determine whether there exists a subset whose sum equals that target. For example, in [3, 34, 4, 12, 5, 2] with target 9, the answer is true because 4 + 5 = 9.

This is not the same as choosing a contiguous range. Elements can come from any positions, but each element can be used at most once.

Brute Force Baseline

The direct approach explores every subset. For n values there are 2^n subsets, so the runtime grows exponentially.

python

1def subset_sum_bruteforce(nums, target, i=0):
2    if target == 0:
3        return True
4    if i == len(nums):
5        return False
6
7    return (
8        subset_sum_bruteforce(nums, target - nums[i], i + 1)
9        or subset_sum_bruteforce(nums, target, i + 1)
10    )
11
12
13print(subset_sum_bruteforce([3, 34, 4, 12, 5, 2], 9))

This is useful for explaining the search tree, but it becomes impractical quickly.

Dynamic Programming Idea

For non-negative integers, define dp[s] to mean "it is possible to make sum s using some prefix of the input processed so far." Start with only sum 0 being reachable.

Each new number updates the set of reachable sums. To avoid reusing the same number multiple times, update the sums in descending order.

python

1def subset_sum(nums, target):
2    dp = [False] * (target + 1)
3    dp[0] = True
4
5    for num in nums:
6        for s in range(target, num - 1, -1):
7            if dp[s - num]:
8                dp[s] = True
9
10    return dp[target]
11
12
13print(subset_sum([3, 34, 4, 12, 5, 2], 9))
14print(subset_sum([3, 34, 4, 12, 5, 2], 30))

This runs in O(n * target) time and O(target) space. That is still expensive if the target is huge, but it is far better than enumerating every subset in many practical cases.

Reconstruct A Valid Subset

Sometimes you need the actual subset, not only a boolean answer. One simple method is to keep predecessor information while filling the table.

python

1def subset_sum_with_solution(nums, target):
2    reachable = {0: []}
3
4    for num in nums:
5        updates = {}
6        for current_sum, subset in reachable.items():
7            new_sum = current_sum + num
8            if new_sum <= target and new_sum not in reachable:
9                updates[new_sum] = subset + [num]
10        reachable.update(updates)
11
12    return reachable.get(target)
13
14
15print(subset_sum_with_solution([3, 34, 4, 12, 5, 2], 9))

This version is easy to read and good for moderate targets. It trades some extra memory for simpler reconstruction.

Why The Problem Is Famous

Subset sum is NP-complete in the general case, which means no polynomial-time algorithm is known in terms of both input length and numeric magnitude. The dynamic programming solution is called pseudo-polynomial because it depends on the target value itself, not only on the number of bits needed to represent it.

That distinction matters. A target of 10,000,000 can make the DP table too large even for a modest number of input values.

Common Pitfalls

The most common mistake is iterating the DP array from low to high. That accidentally allows the same value to be reused multiple times in one pass, turning the algorithm into a different problem.

Another mistake is applying the standard table directly when negative numbers are allowed. Negative values require offsetting or a different state representation because reachable sums are no longer limited to 0 through target.

A third issue is using the exact algorithm when the target is extremely large. At that point, meet-in-the-middle or approximation strategies may be more appropriate.

Summary

Subset sum asks whether any subset adds up to a target.
Brute force is exponential because it explores all subsets.
Dynamic programming reduces the exact solution to O(n * target) time for non-negative integers.
Update the DP array from high to low so each number is used at most once.
Use a reconstruction strategy if you need the actual subset, not just a yes or no answer.