How to Make a List Contain Only Distinct Elements in Python

Introduction

Making a Python list contain only distinct elements is a common task, but the right solution depends on what “distinct” means for your data. If order does not matter, one approach is enough. If order matters, or if the list contains dictionaries or custom objects, you need a more careful strategy.

The Shortest Solution: Convert Through set

If your items are hashable and you do not care about preserving order, the shortest answer is:

python
values = [3, 1, 3, 2, 1, 5]
distinct = list(set(values))
print(distinct)

This works because a set keeps only one copy of each hashable value. It is usually fast, but it comes with an important tradeoff: sets are unordered, so the resulting list can come back in an arbitrary order that does not match the input.

Use this approach only when order is irrelevant.
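
If the result should be reproducible but the input order itself is unimportant, one option is to sort the set. This is a minimal sketch, assuming the values can be compared with each other; the name distinct_sorted is only illustrative.

```python
values = [3, 1, 3, 2, 1, 5]

# set() drops duplicates; sorted() returns a new list in ascending order,
# so the result no longer depends on set iteration order.
distinct_sorted = sorted(set(values))
print(distinct_sorted)  # [1, 2, 3, 5]
```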

Preserve Order With dict.fromkeys

If you want the first occurrence of each item and you want to keep the original order, dict.fromkeys is a strong default in Python 3.7 and later, where dictionaries are guaranteed to preserve insertion order:

python
values = ["api", "db", "api", "cache", "db", "worker"]
distinct = list(dict.fromkeys(values))
print(distinct)

Output:

python
['api', 'db', 'cache', 'worker']

This is often the best one-line answer because it is concise, order-preserving, and efficient for hashable elements.

Use an Explicit Loop for Custom Rules

Sometimes duplicates are not exact textual duplicates. You may want case-insensitive matching, trimmed whitespace, or uniqueness by a normalized key. An explicit loop is clearer in those cases:

python
values = ["A", "b", "a", "B", "c"]
seen = set()
distinct = []

for item in values:
    normalized = item.lower()
    if normalized not in seen:
        seen.add(normalized)
        distinct.append(item)

print(distinct)

This preserves the first original spelling while treating "A" and "a" as duplicates. That is something set(values) cannot express correctly on its own.
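
The same loop can be wrapped in a small reusable function that takes the normalization rule as a parameter. The sketch below is one possible shape, not a standard library API: the name unique_by and its key parameter are hypothetical, and it assumes the key function returns a hashable value.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def unique_by(items: Iterable[T], key: Callable[[T], Hashable]) -> List[T]:
    """Return the first item seen for each distinct key, in input order."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


names = ["  Alice", "alice ", "Bob", "BOB", "carol"]
# Treat names as equal after trimming whitespace and lowercasing.
print(unique_by(names, key=lambda s: s.strip().lower()))
# ['  Alice', 'Bob', 'carol']
```

Passing the rule as a function keeps the deduplication logic in one place while each call site decides what "the same" means.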

Handling Unhashable Elements

A list whose elements are dictionaries or other lists cannot be passed directly to set, because those elements are unhashable.

python
records = [{"id": 1}, {"id": 1}]
# set(records) would raise TypeError

In that situation, define distinctness by a stable key:

python
records = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
    {"id": 1, "name": "Ada Lovelace"},
]

seen_ids = set()
distinct = []

for record in records:
    if record["id"] not in seen_ids:
        seen_ids.add(record["id"])
        distinct.append(record)

print(distinct)

Now uniqueness is based on id, which is often what you really want in application code.
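
The loop above keeps the first record for each id. If the rule is instead "keep the most recent record for each id", a dictionary keyed by id is one way to sketch it. This assumes every record has a hashable id; the names latest_by_id and distinct_latest are illustrative.

```python
records = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
    {"id": 1, "name": "Ada Lovelace"},
]

# Later records overwrite earlier ones that share the same id.
# Each id keeps the position of its first appearance in the dict.
latest_by_id = {record["id"]: record for record in records}
distinct_latest = list(latest_by_id.values())
print(distinct_latest)
# [{'id': 1, 'name': 'Ada Lovelace'}, {'id': 2, 'name': 'Grace'}]
```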

Deduplicating While Building the Result

If you are constructing the list incrementally, it can be cleaner to avoid duplicates as you go:

python
values = ["alice", "bob", "alice", "carol"]
seen = set()
result = []

for value in values:
    if value not in seen:
        seen.add(value)
        result.append(value)

print(result)

This pattern scales better than repeatedly checking if value not in result, because a set membership test takes roughly constant time on average, while scanning the result list takes time proportional to its length.
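
When values arrive one at a time from different places in a program, the set and the list can be bundled together. The class below is a hypothetical helper written for illustration (UniqueList is not a built-in type), and it assumes the values are hashable.

```python
class UniqueList:
    """Hypothetical helper: a list that ignores values it has already seen."""

    def __init__(self):
        self._seen = set()  # fast membership checks
        self.items = []     # values in first-seen order

    def add(self, value):
        """Append value if it is new; return True when it was added."""
        if value in self._seen:
            return False
        self._seen.add(value)
        self.items.append(value)
        return True


users = UniqueList()
for name in ["alice", "bob", "alice", "carol"]:
    users.add(name)

print(users.items)  # ['alice', 'bob', 'carol']
```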

Choosing the Right Method

A simple decision guide is:

  • use set when order does not matter
  • use dict.fromkeys when order matters and elements are hashable
  • use a loop with a custom key when uniqueness depends on normalization or object fields

The biggest mistake is treating deduplication as only a performance question. It is really a semantics question first.

Common Pitfalls

The most common pitfall is using set and then being surprised that the output order is not the same as the input order.

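Another issue is assuming every list element is hashable. Dictionaries, lists, and sets cannot be added to a set directly. Custom objects are hashable by default, but only by identity, so two instances with the same field values are not treated as duplicates unless the class defines value-based equality and hashing.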
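
Custom classes deserve a closer look here. The sketch below contrasts a plain class, which is compared by identity, with a frozen dataclass, which compares and hashes by field values. Both PlainPoint and Point are hypothetical classes made up for this illustration.

```python
from dataclasses import dataclass


class PlainPoint:
    """Default hashing is by identity, so equal-looking instances differ."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


@dataclass(frozen=True)
class Point:
    """frozen=True with the default eq=True gives value-based __eq__ and __hash__."""
    x: int
    y: int


plain = [PlainPoint(1, 2), PlainPoint(1, 2)]
print(len(set(plain)))  # 2, nothing was removed

points = [Point(1, 2), Point(1, 2), Point(3, 4)]
print(list(dict.fromkeys(points)))
# [Point(x=1, y=2), Point(x=3, y=4)]
```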

People also often deduplicate the raw value when the business rule really depends on a transformed value, such as lowercase email addresses or trimmed user names.
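
As a small illustration of deduplicating by a transformed value, the sketch below keeps the first original spelling of each email address. The normalization rule (trim whitespace, then lowercase) is an assumption chosen for the example; real systems may need a different rule.

```python
emails = ["Ada@Example.com", " ada@example.com", "grace@example.com "]

first_by_key = {}
for email in emails:
    # Normalization rule (an assumption): trim whitespace, then lowercase.
    key = email.strip().lower()
    # setdefault stores the value only if the key is not present yet.
    first_by_key.setdefault(key, email)

print(list(first_by_key))           # ['ada@example.com', 'grace@example.com']
print(list(first_by_key.values()))  # ['Ada@Example.com', 'grace@example.com ']
```

Keeping both the normalized key and the original value makes it easy to choose which form to store or display.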

Finally, avoid this pattern for large inputs:

python
distinct = []
for item in values:
    if item not in distinct:
        distinct.append(item)

It works, but each membership check scans the list built so far, so the total work grows roughly quadratically with the size of the input.

Summary

  • list(set(values)) removes duplicates quickly when order does not matter.
  • list(dict.fromkeys(values)) is a clean order-preserving solution for hashable items.
  • Use an explicit loop when you need custom uniqueness rules.
  • For unhashable objects, deduplicate by a stable key rather than by the object itself.
  • Pick the method based on meaning first, then performance.
