Compute list difference

Introduction

Computing the difference between two lists sounds simple until duplicates become relevant. If one list contains repeated values, you have to decide whether you want mathematical set difference, order-preserving filtering, or multiset subtraction where each matching duplicate is removed only once.

Those are different problems, and they produce different answers. The correct approach depends on whether duplicate counts matter, whether original order matters, and how large the inputs are.
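To make the distinction concrete, here is a minimal sketch that applies all three interpretations to the same pair of lists. The variable names are illustrative only.

```python
from collections import Counter

a = [3, 1, 2, 2, 2, 4]
b = [2, 4]

# Set difference: unique values only. sorted() is used here just to get a
# deterministic order, because sets are unordered.
set_diff = sorted(set(a) - set(b))

# Order-preserving filter: every value found in b is removed.
b_lookup = set(b)
filtered = [item for item in a if item not in b_lookup]

# Multiset subtraction: each item in b cancels one matching copy in a.
multiset_diff = list((Counter(a) - Counter(b)).elements())

print(set_diff)       # [1, 3]
print(filtered)       # [3, 1]
print(multiset_diff)  # [3, 1, 2, 2]
```

The same inputs give three different results, which is why the intended semantics should be settled before picking an implementation.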

Use Set Difference When Uniqueness Is All You Need

If you only care about unique values, convert both lists to sets and subtract them.

python

a = [1, 2, 2, 3, 4]
b = [2, 4]

# Sets are unordered, so the order of the resulting list is not guaranteed.
result = list(set(a) - set(b))
print(result)  # contains 1 and 3

This gives the distinct values from a that do not appear in b. It is concise and usually fast, but it throws away:

original order
duplicate counts

That makes it useful for membership-style problems, but not for cases where the list itself carries meaning through order or repetition.
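If a predictable order is still wanted for the unique values, two common options are sorting the result or keeping first-seen order with `dict.fromkeys()`. The following is a sketch under the assumption that the elements are hashable (and, for `sorted()`, mutually comparable); the variable names are illustrative.

```python
a = [4, 3, 2, 2, 1, 3]
b = [2, 4]

# Plain set difference: the resulting order is not guaranteed.
unordered = list(set(a) - set(b))

# sorted() gives a deterministic order, but it is sorted order,
# not the order in which values appeared in a.
in_sorted_order = sorted(set(a) - set(b))

# dict.fromkeys() keeps the first occurrence of each value in insertion
# order (dicts preserve insertion order in Python 3.7+).
b_lookup = set(b)
first_seen = [item for item in dict.fromkeys(a) if item not in b_lookup]

print(in_sorted_order)  # [1, 3]
print(first_seen)       # [3, 1]
```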

Preserve Order While Removing All Matching Values

If you want to keep the order from the first list and remove every value that appears in the second list, use a lookup set and a comprehension.

python

a = [1, 2, 2, 3, 4]
b = [2, 4]

b_lookup = set(b)
result = [item for item in a if item not in b_lookup]

print(result)

This prints:

python

[1, 3]

Every 2 is removed because 2 exists in b. That is often the intended meaning in application code, such as filtering out disabled feature flags or excluding blocked identifiers from an ordered result set.
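As an illustration of the blocked-identifier case, the filter can be wrapped in a small helper. The function name `filter_out` and the sample IDs are hypothetical, chosen only for this sketch.

```python
def filter_out(items, excluded):
    """Return items in original order, dropping every value present in excluded."""
    excluded_lookup = set(excluded)  # built once; elements must be hashable
    return [item for item in items if item not in excluded_lookup]


ranked_user_ids = ["u7", "u2", "u9", "u2", "u4"]
blocked_user_ids = ["u2", "u5"]

print(filter_out(ranked_user_ids, blocked_user_ids))  # ['u7', 'u9', 'u4']
```

Both copies of `"u2"` are dropped, and `"u5"` is simply ignored because it never appears in the first list.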

Use `Counter` When Duplicate Counts Matter

Sometimes you want multiset subtraction. In other words, if b contains two copies of a value, only two copies should be removed from a.

python

from collections import Counter

a = [1, 2, 2, 2, 3, 4]
b = [2, 2, 4]

result_counter = Counter(a) - Counter(b)
result = list(result_counter.elements())

print(result)

Possible output:

python

[1, 2, 3]

One 2 remains because the first list had three copies and the second list had two. That is the key difference between duplicate-aware subtraction and simple filtering.
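One detail worth knowing is what happens when the second list contains more copies of a value than the first. The sketch below, using made-up sample data, shows that the `-` operator on `Counter` drops zero and negative counts, while `Counter.subtract()` keeps them.

```python
from collections import Counter

a = ["x", "y", "y"]
b = ["y", "y", "y", "z"]

# The - operator keeps only positive counts.
diff = Counter(a) - Counter(b)
print(diff)                   # Counter({'x': 1})
print(list(diff.elements()))  # ['x']

# subtract() works in place and keeps zero and negative counts.
signed = Counter(a)
signed.subtract(Counter(b))
print(signed["y"], signed["z"])  # -1 -1
```

For list difference the `-` operator is usually the behavior you want, since a value cannot appear a negative number of times in the result.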

Preserve Both Order and Duplicate Counts

Counter handles counts well, but elements() groups equal values together, in the order each value was first seen, so it does not reproduce the left-to-right positions from the original list. If you need both duplicate-aware subtraction and stable order, keep a removal counter and walk the first list manually.

python

from collections import Counter

def ordered_multiset_difference(a, b):
    # How many copies of each value still need to be removed.
    to_remove = Counter(b)
    result = []

    for item in a:
        if to_remove[item] > 0:
            # Skip this copy and use up one pending removal.
            to_remove[item] -= 1
        else:
            result.append(item)

    return result

a = [1, 2, 2, 2, 3, 4]
b = [2, 2, 4]

print(ordered_multiset_difference(a, b))

This returns:

python

[1, 2, 3]

Now you get the behavior many people really need:

remove only as many duplicates as the second list contains
keep the surviving elements in their original order
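The difference in ordering only shows up when equal values are not adjacent in the input. The following sketch repeats the function so the block runs on its own and compares it with `Counter` subtraction on interleaved sample data.

```python
from collections import Counter


def ordered_multiset_difference(a, b):
    to_remove = Counter(b)
    result = []
    for item in a:
        if to_remove[item] > 0:
            to_remove[item] -= 1  # drop this copy
        else:
            result.append(item)
    return result


a = ["b", "a", "b", "c", "a", "b"]
b = ["b"]

# Counter subtraction groups equal values together.
grouped = list((Counter(a) - Counter(b)).elements())

# The manual walk keeps survivors at their original relative positions.
ordered = ordered_multiset_difference(a, b)

print(grouped)  # ['b', 'b', 'a', 'a', 'c']
print(ordered)  # ['a', 'b', 'c', 'a', 'b']
```

Note that the manual walk removes the earliest matching occurrences. If the latest occurrences should be removed instead, one option is to walk the list in reverse and reverse the result.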

Think About Complexity and Semantics Together

The naive pattern below works but becomes slow on larger lists. Each membership check against a plain list is linear in the length of b, so the whole comprehension takes time proportional to len(a) * len(b):

python

result = [x for x in a if x not in b]

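If order-preserving full filtering is the goal, converting b to a set makes each membership check constant time on average, so the total work is roughly proportional to len(a) + len(b). This requires the elements to be hashable. If duplicate counts matter, Counter is cleaner and safer than repeatedly calling list.remove() inside a loop.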
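For comparison, this is roughly what the `list.remove()` loop looks like. It is a sketch with a hypothetical function name, shown mainly to make the trade-off visible: every `remove()` call scans the list and shifts elements, and it raises `ValueError` when the value is absent, so the loop needs explicit handling.

```python
from collections import Counter


def difference_with_remove(a, b):
    """Duplicate-aware difference using list.remove(); quadratic in the worst case."""
    result = list(a)  # copy so the caller's list is not mutated
    for item in b:
        try:
            result.remove(item)  # removes the first matching occurrence
        except ValueError:
            pass  # item from b is not present (any more); nothing to remove
    return result


def difference_with_counter(a, b):
    """Same result with a single pass over a; requires hashable elements."""
    to_remove = Counter(b)
    result = []
    for item in a:
        if to_remove[item] > 0:
            to_remove[item] -= 1
        else:
            result.append(item)
    return result


a = [1, 2, 2, 2, 3, 4]
b = [2, 2, 4, 9]

print(difference_with_remove(a, b))   # [1, 2, 3]
print(difference_with_counter(a, b))  # [1, 2, 3]
```

One case where the `remove()` version is still useful is when the elements are unhashable (for example, nested lists), since it relies only on equality comparison.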

Performance matters, but semantics matter first. A fast solution that removes too many duplicates or scrambles order is still wrong.

Common Pitfalls

The most common mistake is using set subtraction when duplicates or order matter. Another is checking membership against a plain list inside a comprehension on large inputs, which is correct but slower than using a set lookup. Developers also confuse "remove every matching value" with "remove only one matching copy for each appearance in the second list," even though those are different operations.

It is also easy to forget that Counter solves counts, not stable order. If both requirements matter, you need one more pass through the original list.

Summary

Use set subtraction when you only care about unique values.
Use an order-preserving filter when every matching value should be removed.
Use Counter subtraction when duplicate counts matter.
Walk the first list with a removal counter if you need both duplicate-aware subtraction and original order.
Choose the algorithm based on semantics first, then optimize performance.
