Python
JSON
Serialization
Sets
Programming

How to JSON serialize sets?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

JSON has arrays, objects, strings, numbers, booleans, and null, but it does not have a native set type. That is why Python’s json module raises an error when you try to serialize a set directly. The usual fix is to convert the set into a JSON-compatible structure first, most often a list.

Why Sets Fail with json.dumps

Python sets are unordered collections of unique values. The standard JSON encoder does not know how to turn them into one of JSON’s supported types.

python
1import json
2
3data = {"tags": {"python", "json", "api"}}
4
5try:
6    print(json.dumps(data))
7except TypeError as exc:
8    print(exc)

The error is expected: a set is not JSON serializable by default.

Simplest Fix: Convert the Set to a List

If you only need to emit JSON and do not care about preserving the type on decode, convert the set to a list.

python
1import json
2
3data = {"tags": {"python", "json", "api"}}
4serializable = {"tags": list(data["tags"])}
5
6print(json.dumps(serializable))

This is usually enough for APIs where the consumer only needs the values, not the fact that they originally came from a set.

Be Careful About Ordering

Sets are unordered, so converting them to a list can produce different element orders across runs. If deterministic output matters, sort the values before serialization.

python
1import json
2
3tags = {"python", "json", "api"}
4payload = {"tags": sorted(tags)}
5
6print(json.dumps(payload))

Sorting is especially useful in tests, caches, and snapshots where stable JSON output matters.

Use a Custom default Function

If sets appear throughout your data structures, manually converting each one becomes tedious. A cleaner approach is to give json.dumps a default function.

python
1import json
2
3def json_default(value):
4    if isinstance(value, set):
5        return sorted(value)
6    raise TypeError(f"Unsupported type: {type(value)!r}")
7
8data = {
9    "tags": {"python", "json"},
10    "admins": {"ava", "noah"}
11}
12
13print(json.dumps(data, default=json_default))

Whenever the encoder encounters a set, it calls json_default and uses the returned list instead.

Custom JSONEncoder Class

If you want a reusable serializer object, subclass json.JSONEncoder.

python
1import json
2
3class SetEncoder(json.JSONEncoder):
4    def default(self, obj):
5        if isinstance(obj, set):
6            return list(obj)
7        return super().default(obj)
8
9data = {"values": {1, 2, 3}}
10print(json.dumps(data, cls=SetEncoder))

This is useful when you want a consistent encoding strategy across multiple modules.

Preserving the Fact That It Was a Set

Converting a set to a list loses type information. If you need round-trip reconstruction, encode the type explicitly.

python
1import json
2
3def encode_sets(obj):
4    if isinstance(obj, set):
5        return {"__type__": "set", "items": sorted(obj)}
6    raise TypeError(f"Unsupported type: {type(obj)!r}")
7
8text = json.dumps({"tags": {"python", "json"}}, default=encode_sets)
9print(text)

Then decode it with an object_hook:

python
1def decode_sets(obj):
2    if obj.get("__type__") == "set":
3        return set(obj["items"])
4    return obj
5
6restored = json.loads(text, object_hook=decode_sets)
7print(restored)
8print(type(restored["tags"]))

This costs more complexity, so use it only when you truly need to restore the original type.

Nested Structures

Sets often appear inside larger dictionaries or lists. A default function or custom encoder handles those nested cases automatically as long as the unsupported value eventually reaches the encoder callback.

python
1data = {
2    "groups": [
3        {"name": "admins", "members": {"ava", "mia"}},
4        {"name": "editors", "members": {"liam", "noah"}}
5    ]
6}
7
8print(json.dumps(data, default=json_default))

That is much easier than manually walking the entire structure yourself.

When JSON Is the Wrong Format

If the receiving system truly cares about a native set type, JSON may not be the best format. JSON is designed for interoperability, not lossless representation of every language-specific type. In many systems, the cleanest approach is to document that the field is a JSON array with unique values and enforce uniqueness at the application level.

Common Pitfalls

The most common mistake is assuming Python collections map directly to JSON types one-to-one. Another is forgetting that set-to-list conversion produces unstable ordering unless you sort first. Developers also sometimes claim they need “set serialization” when what they really need is a JSON array with unique entries. Finally, round-trip decoding is easy to overengineer if the consumer does not actually need to rebuild a Python set.

Summary

  • Python sets are not JSON serializable by default.
  • The simplest solution is to convert sets to lists before encoding.
  • Sort the values first if stable output matters.
  • Use default= or a custom JSONEncoder when sets appear throughout nested data.
  • Encode explicit type metadata only if you truly need to reconstruct a set during decoding.

Course illustration
Course illustration

All Rights Reserved.