How to JSON serialize sets?
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
JSON has arrays, objects, strings, numbers, booleans, and null, but it does not have a native set type. That is why Python’s json module raises an error when you try to serialize a set directly. The usual fix is to convert the set into a JSON-compatible structure first, most often a list.
Why Sets Fail with json.dumps
Python sets are unordered collections of unique values. The standard JSON encoder does not know how to turn them into one of JSON’s supported types.
The error is expected: a set is not JSON serializable by default.
Simplest Fix: Convert the Set to a List
If you only need to emit JSON and do not care about preserving the type on decode, convert the set to a list.
This is usually enough for APIs where the consumer only needs the values, not the fact that they originally came from a set.
Be Careful About Ordering
Sets are unordered, so converting them to a list can produce different element orders across runs. If deterministic output matters, sort the values before serialization.
Sorting is especially useful in tests, caches, and snapshots where stable JSON output matters.
Use a Custom default Function
If sets appear throughout your data structures, manually converting each one becomes tedious. A cleaner approach is to give json.dumps a default function.
Whenever the encoder encounters a set, it calls json_default and uses the returned list instead.
Custom JSONEncoder Class
If you want a reusable serializer object, subclass json.JSONEncoder.
This is useful when you want a consistent encoding strategy across multiple modules.
Preserving the Fact That It Was a Set
Converting a set to a list loses type information. If you need round-trip reconstruction, encode the type explicitly.
Then decode it with an object_hook:
This costs more complexity, so use it only when you truly need to restore the original type.
Nested Structures
Sets often appear inside larger dictionaries or lists. A default function or custom encoder handles those nested cases automatically as long as the unsupported value eventually reaches the encoder callback.
That is much easier than manually walking the entire structure yourself.
When JSON Is the Wrong Format
If the receiving system truly cares about a native set type, JSON may not be the best format. JSON is designed for interoperability, not lossless representation of every language-specific type. In many systems, the cleanest approach is to document that the field is a JSON array with unique values and enforce uniqueness at the application level.
Common Pitfalls
The most common mistake is assuming Python collections map directly to JSON types one-to-one. Another is forgetting that set-to-list conversion produces unstable ordering unless you sort first. Developers also sometimes claim they need “set serialization” when what they really need is a JSON array with unique entries. Finally, round-trip decoding is easy to overengineer if the consumer does not actually need to rebuild a Python set.
Summary
- Python sets are not JSON serializable by default.
- The simplest solution is to convert sets to lists before encoding.
- Sort the values first if stable output matters.
- Use
default=or a customJSONEncoderwhen sets appear throughout nested data. - Encode explicit type metadata only if you truly need to reconstruct a set during decoding.

