Compute list difference

Introduction

Computing the difference between two lists sounds simple until duplicates and ordering become relevant. If one list contains repeated values, you have to decide whether you want mathematical set difference, order-preserving filtering, or multiset subtraction where each matching duplicate is removed only once.

Those are different problems, and they produce different answers. The correct approach depends on whether duplicate counts matter, whether original order matters, and how large the inputs are.
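
To make the distinction concrete, here is a minimal sketch that runs all three interpretations on the same input. The sample lists and variable names are illustrative assumptions, and the last result relies on Counter keeping insertion order (Python 3.7 and later).

```python
from collections import Counter

a = [3, 2, 2, 1, 2]
b = [2]

# Set difference: unique values only; sorted() just makes the output stable.
set_diff = sorted(set(a) - set(b))

# Order-preserving filter: every 2 is dropped because 2 appears in b.
b_lookup = set(b)
filtered = [item for item in a if item not in b_lookup]

# Multiset subtraction: one 2 is removed for each 2 in b.
multiset_diff = list((Counter(a) - Counter(b)).elements())

print(set_diff)       # [1, 3]
print(filtered)       # [3, 1]
print(multiset_diff)  # [3, 2, 2, 1]
```

Same inputs, three different answers, which is why the intended semantics have to be settled before picking a technique.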

Use Set Difference When Uniqueness Is All You Need

If you only care about unique values, convert both lists to sets and subtract them.

```python
a = [1, 2, 2, 3, 4]
b = [2, 4]

result = list(set(a) - set(b))
print(result)  # e.g. [1, 3]; set iteration order is not guaranteed
```

This gives the distinct values from a that do not appear in b. It is concise and usually fast, but it throws away:

  • original order
  • duplicate counts

That makes it useful for membership-style problems, but not for cases where the list itself carries meaning through order or repetition.
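
If the unique values still need to come out in a predictable sequence, one option is to sort the resulting set. The sketch below assumes the values are mutually comparable (so that `sorted()` works) and uses `set.difference()`, which accepts any iterable as its argument.

```python
a = [1, 2, 2, 3, 4]
b = [2, 4]

# set.difference() takes any iterable, so b does not need converting first.
unique_left = set(a).difference(b)

# Sets are unordered; sort when a predictable sequence is needed.
result = sorted(unique_left)
print(result)  # [1, 3]
```

Note that sorting gives sorted order, not the original order of a. If the original order matters, a different technique is needed.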

Preserve Order While Removing All Matching Values

If you want to keep the order from the first list and remove every value that appears in the second list, use a lookup set and a comprehension.

```python
a = [1, 2, 2, 3, 4]
b = [2, 4]

b_lookup = set(b)
result = [item for item in a if item not in b_lookup]

print(result)
```

This prints:

```python
[1, 3]
```

Every 2 is removed because 2 exists in b. That is often the intended meaning in application code, such as filtering out disabled feature flags or excluding blocked identifiers from an ordered result set.
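
As a rough illustration of the blocked-identifier case, the filter can be wrapped in a small helper. The function name `exclude_blocked` and the sample identifiers are made up for this sketch; it also assumes the identifiers are hashable.

```python
def exclude_blocked(ordered_ids, blocked_ids):
    """Return ordered_ids without any id found in blocked_ids, order preserved."""
    blocked = set(blocked_ids)  # built once; requires hashable ids
    return [item for item in ordered_ids if item not in blocked]


search_results = ["u7", "u2", "u9", "u2", "u4"]  # hypothetical identifiers
blocked_users = ["u2", "u4"]

print(exclude_blocked(search_results, blocked_users))  # ['u7', 'u9']
```

Both copies of "u2" disappear, which matches the "remove every matching value" semantics of this section.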

Use Counter When Duplicate Counts Matter

Sometimes you want multiset subtraction. In other words, if b contains two copies of a value, only two copies should be removed from a.

```python
from collections import Counter

a = [1, 2, 2, 2, 3, 4]
b = [2, 2, 4]

result_counter = Counter(a) - Counter(b)
result = list(result_counter.elements())

print(result)
```

Output (on Python 3.7 and later, where Counter keeps insertion order):

```python
[1, 2, 3]
```

One 2 remains because the first list had three copies and the second list had two. That is the key difference between duplicate-aware subtraction and simple filtering.
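
One detail worth checking is what happens when the second list holds more copies than the first, or values the first never contained. The sketch below, with illustrative inputs, shows that the `-` operator on Counter objects keeps only positive counts, while the in-place `subtract()` method keeps zero and negative counts.

```python
from collections import Counter

a = [1, 2, 3]
b = [2, 2, 2, 5]  # more 2s than a has, plus a value a never contained

# The - operator drops anything whose count would be zero or negative.
diff = Counter(a) - Counter(b)
print(list(diff.elements()))  # [1, 3]

# subtract() mutates in place and keeps negative counts instead.
signed = Counter(a)
signed.subtract(b)
print(signed[2], signed[5])  # -2 -1
```

For list difference the operator form is usually the one you want, since elements() skips non-positive counts anyway.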

Preserve Both Order and Duplicate Counts

Counter handles counts well, but elements() groups equal values together in order of first appearance, so it does not reproduce the left-to-right sequence of the original list. If you need both duplicate-aware subtraction and stable order, keep a removal counter and walk the first list manually.

```python
from collections import Counter

def ordered_multiset_difference(a, b):
    """Remove one item from a per matching item in b, keeping a's order."""
    to_remove = Counter(b)  # how many copies of each value are still to be dropped
    result = []

    for item in a:
        if to_remove[item] > 0:
            to_remove[item] -= 1  # drop this copy and use up one removal
        else:
            result.append(item)

    return result

a = [1, 2, 2, 2, 3, 4]
b = [2, 2, 4]

print(ordered_multiset_difference(a, b))
```

This returns:

```python
[1, 2, 3]
```

Now you get the behavior many people really need:

  • remove only as many duplicates as the second list contains
  • keep the surviving elements in their original order
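
The difference from plain Counter subtraction only shows up when equal values are not adjacent in the first list. The following sketch repeats the manual walk so that it is self-contained, then compares both approaches on an illustrative input; it assumes Python 3.7 or later for the Counter ordering.

```python
from collections import Counter


def ordered_multiset_difference(a, b):
    to_remove = Counter(b)
    result = []
    for item in a:
        if to_remove[item] > 0:
            to_remove[item] -= 1  # use up one pending removal
        else:
            result.append(item)
    return result


a = [3, 1, 3, 2, 1]
b = [3]

grouped = list((Counter(a) - Counter(b)).elements())
ordered = ordered_multiset_difference(a, b)

print(grouped)  # [3, 1, 1, 2]  equal values grouped together
print(ordered)  # [1, 3, 2, 1]  original sequence, first 3 removed
```

Both results contain the same items with the same counts; only the manual walk keeps the survivors where they originally stood. Note that it removes the earliest matching occurrences, which may or may not be what a given use case wants.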

Think About Complexity and Semantics Together

The naive pattern below works but becomes slow on larger lists: each membership check against a plain list is a linear scan, so the total cost grows with len(a) × len(b):

```python
result = [x for x in a if x not in b]
```

If order-preserving full filtering is the goal, converting b to a set makes each lookup constant time on average, as long as the items are hashable. If duplicate counts matter, Counter is cleaner and safer than repeatedly calling list.remove() inside a loop.
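
A quick way to see the gap on your own machine is a small `timeit` comparison. This is only a rough sketch: the list sizes and the helper names `filter_with_list` and `filter_with_set` are arbitrary choices for illustration, and the actual timings will vary by machine and Python version.

```python
import timeit

a = list(range(2000))
b = list(range(0, 2000, 2))  # every even number


def filter_with_list(a, b):
    return [x for x in a if x not in b]  # linear scan of b for each item


def filter_with_set(a, b):
    b_lookup = set(b)  # built once
    return [x for x in a if x not in b_lookup]  # average constant-time lookup


list_time = timeit.timeit(lambda: filter_with_list(a, b), number=5)
set_time = timeit.timeit(lambda: filter_with_set(a, b), number=5)

print(f"list membership: {list_time:.4f}s")
print(f"set membership:  {set_time:.4f}s")
```

Both helpers return the same list; only the cost of each membership check differs.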

Performance matters, but semantics matter first. A fast solution that removes too many duplicates or scrambles order is still wrong.

Common Pitfalls

The most common mistake is using set subtraction when duplicates or order matter. Another is writing a list comprehension against a plain list for membership checks on large inputs, which is correct but slower than using a set lookup. Developers also confuse "remove every matching value" with "remove only one matching copy for each appearance in the second list," even though those are different operations.

It is also easy to forget that Counter solves counts, not stable order. If both requirements matter, you need one more pass through the original list.

Summary

  • Use set subtraction when you only care about unique values.
  • Use an order-preserving filter when every matching value should be removed.
  • Use Counter subtraction when duplicate counts matter.
  • Walk the first list with a removal counter if you need both duplicate-aware subtraction and original order.
  • Choose the algorithm based on semantics first, then optimize performance.
