Saving and Loading Objects with pickle

Introduction

Python's pickle module is the standard built-in way to serialize many Python objects to bytes and load them back later. It is convenient, fast for Python-to-Python workflows, and widely used for caching, model checkpoints, and quick persistence of complex objects. The catch is that pickle is Python-specific and unsafe to load from untrusted sources, so it should be used deliberately rather than casually.

What Pickle Does

Pickling means turning a Python object into a byte stream. Unpickling means reconstructing the object from that byte stream.

That works well for many common Python objects such as:

  • lists, dicts, and sets
  • custom class instances
  • nested Python data structures

A minimal example looks like this:

```python
import pickle

settings = {"host": "localhost", "port": 5432, "debug": True}

with open("settings.pkl", "wb") as f:
    pickle.dump(settings, f)

with open("settings.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored)
```

This is the core workflow: open in binary mode, dump, then later load.
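The same round trip also works in memory, which makes the "byte stream" idea easier to see. The sketch below uses `pickle.dumps` and `pickle.loads` on a nested structure; the `inventory` data is made up for illustration.

```python
import pickle

# Hypothetical nested data mixing sets, lists, tuples, and dicts.
inventory = {
    "tags": {"red", "blue"},
    "matrix": [[1, 2], [3, 4]],
    "point": (1.5, -2.0),
    "meta": {"owner": None},
}

blob = pickle.dumps(inventory)   # object -> bytes
restored = pickle.loads(blob)    # bytes -> new object

print(type(blob).__name__)       # bytes
print(restored == inventory)     # True: same contents
print(restored is inventory)     # False: a separate copy
```

Note that the restored object is an independent copy, so changes to it do not affect the original.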

Saving Custom Objects

Pickle can also store many custom class instances as long as their class definitions are importable when unpickling.

```python
import pickle

class User:
    def __init__(self, name, role):
        self.name = name
        self.role = role

    def __repr__(self):
        return f"User(name={self.name!r}, role={self.role!r})"

user = User("Ava", "admin")

with open("user.pkl", "wb") as f:
    pickle.dump(user, f)

with open("user.pkl", "rb") as f:
    loaded_user = pickle.load(f)

print(loaded_user)
```

This is convenient, but it comes with a compatibility rule. Pickle stores the class's module and name together with the instance's attributes, not the class code itself, so the loading environment must be able to import the same class.
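To see the importability requirement in action, here is a small sketch that builds a throwaway module at runtime, pickles an instance from it, and then removes the module before loading. The module name `demo_models` and its `User` class are hypothetical stand-ins for a real package.

```python
import pickle
import sys
import types

# Build a throwaway module so the example is self-contained.
# "demo_models" is a hypothetical name, not a real package.
module = types.ModuleType("demo_models")
exec(
    "class User:\n"
    "    def __init__(self, name):\n"
    "        self.name = name\n",
    module.__dict__,
)
sys.modules["demo_models"] = module

blob = pickle.dumps(module.User("Ava"))

# The pickle records where to find the class, not the class body.
print(b"demo_models" in blob)    # True

# While the module is importable, loading works.
print(pickle.loads(blob).name)   # Ava

# Simulate an environment where the module does not exist.
del sys.modules["demo_models"]
load_error = None
try:
    pickle.loads(blob)
except ImportError as exc:
    load_error = exc
    print(f"cannot load: {exc!r}")
```

In practice the same failure tends to appear after a class is renamed or moved to another module, because old pickle files still point at the old location.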

Use the Highest Practical Protocol

Pickle supports multiple protocol versions. In most current Python code, using the highest available protocol is a good default.

```python
import pickle

with open("data.pkl", "wb") as f:
    pickle.dump({"numbers": [1, 2, 3]}, f, protocol=pickle.HIGHEST_PROTOCOL)
```

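Newer protocols generally produce more compact output and serialize faster. The tradeoff is that a Python version older than the protocol cannot read the result; protocol 5, for example, requires Python 3.8 or later.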
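A quick way to get a feel for the difference is to serialize the same object with every protocol your interpreter supports and compare sizes. This is only a rough sketch with made-up sample data; exact byte counts depend on the data and the Python version.

```python
import pickle

# Hypothetical sample data: a dict holding a list of small integers.
data = {"numbers": list(range(1000))}

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(data, protocol=protocol)
    # Every protocol should restore an equal object.
    assert pickle.loads(blob) == data
    sizes[protocol] = len(blob)

for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")

print("default:", pickle.DEFAULT_PROTOCOL, "highest:", pickle.HIGHEST_PROTOCOL)
```

For data like this, the text-based protocol 0 is noticeably larger than the binary protocols.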

Security Is the Biggest Warning

Unpickling can execute arbitrary code as part of object reconstruction. That means you should never load pickle data from an untrusted source.
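The mechanism behind this is easy to demonstrate with a deliberately harmless payload. In the sketch below, a hypothetical `Demo` class uses `__reduce__` to tell pickle which callable to invoke when loading; here it is a trivial arithmetic expression, but an attacker could substitute any importable callable.

```python
import pickle

class Demo:
    """Hypothetical class whose pickle asks the loader to call eval()."""

    def __reduce__(self):
        # Meaning: "to rebuild this object, call eval('6 * 7')".
        return (eval, ("6 * 7",))

payload = pickle.dumps(Demo())

# Loading runs the callable named in the payload. No Demo instance comes back;
# the result is whatever the callable returned.
result = pickle.loads(payload)
print(result)   # 42
```

The loader never checks whether the callable is reasonable, which is why the source of the bytes has to be trusted.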

A safe rule is simple:

  • use pickle only for data you created or fully trust
  • do not use it as a general interchange format for external input

If you need a safer cross-language or user-facing format, JSON, CSV, or another constrained format is usually better.
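As a rough illustration of what "constrained" means, the sketch below round-trips the same kind of settings dictionary through the standard `json` module. JSON only represents plain data types, so parsing it yields data rather than running callables, and unsupported types are rejected when writing.

```python
import json

settings = {"host": "localhost", "port": 5432, "debug": True}

text = json.dumps(settings)    # plain, human-readable text
restored = json.loads(text)    # parsing only builds dicts, lists, strings, numbers

print(text)
print(restored == settings)    # True

# JSON refuses types it cannot represent, such as sets.
encode_error = None
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as exc:
    encode_error = exc
    print(f"not JSON serializable: {exc}")
```

The price of that safety is that custom objects need an explicit conversion to and from plain data.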

Pickle Is Python-Specific

Pickle is excellent for Python-to-Python persistence, but it is a poor interoperability format. A JavaScript service, a Go binary, or a shell script cannot meaningfully consume a .pkl file.

That is why pickle is best for:

  • local caches
  • internal Python tools
  • temporary experiment artifacts
  • trusted application state snapshots

It is usually not the right answer for public APIs or long-term archival formats shared across languages.
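For the local-cache use case listed above, a minimal sketch might look like the following. The helper name `load_or_compute` and the `expensive_squares` function are hypothetical, and the example assumes the cache file is written and read only by your own code.

```python
import pickle
import tempfile
from pathlib import Path

def load_or_compute(cache_path, compute):
    """Return the cached result if the file exists, else compute and cache it."""
    path = Path(cache_path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with path.open("wb") as f:
        pickle.dump(result, f, protocol=pickle.HIGHEST_PROTOCOL)
    return result

calls = []  # records how many times the expensive function actually ran

def expensive_squares():
    calls.append(1)
    return {n: n * n for n in range(5)}

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "squares.pkl"
    first = load_or_compute(cache_file, expensive_squares)    # computes and writes
    second = load_or_compute(cache_file, expensive_squares)   # reads from disk

print(first == second)   # True
print(len(calls))        # 1
```

This sketch has no cache invalidation; a real cache would also need a way to detect stale files.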

Good File Handling Practices

Always use binary mode and context managers.

```python
import pickle

scores = [0.98, 0.87, 0.91]

with open("scores.pkl", "wb") as out_file:
    pickle.dump(scores, out_file)

with open("scores.pkl", "rb") as in_file:
    print(pickle.load(in_file))
```

The with blocks guarantee the file is closed properly even if an exception is raised.
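To show why binary mode matters, the sketch below first tries to dump into a file opened in text mode, which fails because pickle writes bytes, and then repeats the operation correctly. It writes to a temporary directory so nothing is left behind.

```python
import os
import pickle
import tempfile

scores = [0.98, 0.87, 0.91]
error_message = None

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.pkl")

    # Wrong: text mode ("w") only accepts str, but pickle writes bytes.
    try:
        with open(path, "w") as f:
            pickle.dump(scores, f)
    except TypeError as exc:
        error_message = str(exc)
        print(f"text mode failed: {error_message}")

    # Right: binary mode for both writing and reading.
    with open(path, "wb") as f:
        pickle.dump(scores, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == scores)   # True
```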

Know When Not to Use Pickle

If the data must be human-readable, portable across languages, or safe to accept from external users, pickle is usually the wrong tool. Likewise, if object compatibility across Python versions and code refactors matters strongly, a more stable serialization format may be easier to maintain.
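One concrete way refactors bite: unpickling restores the saved attributes without calling `__init__`, so attributes added to a class later are missing on objects loaded from older pickles. The sketch below simulates this by redefining a hypothetical `Job` class, then shows one possible mitigation using `__setstate__`; it assumes the code runs as a script so the class is importable by name.

```python
import pickle

class Job:
    """Hypothetical version 1: only has a name."""
    def __init__(self, name):
        self.name = name

old_bytes = pickle.dumps(Job("nightly-report"))

# Simulate a refactor: the class gains a `retries` attribute.
class Job:
    def __init__(self, name, retries=3):
        self.name = name
        self.retries = retries

restored = pickle.loads(old_bytes)
# __init__ was not called, so the new attribute is absent.
print(hasattr(restored, "retries"))   # False

# One possible mitigation: fill in defaults while restoring state.
class Job:
    def __init__(self, name, retries=3):
        self.name = name
        self.retries = retries

    def __setstate__(self, state):
        state.setdefault("retries", 3)
        self.__dict__.update(state)

patched = pickle.loads(old_bytes)
print(patched.name, patched.retries)  # nightly-report 3
```

Handling every past layout this way gets harder as a class evolves, which is part of why an explicit, versioned format can be easier to maintain.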

Pickle is extremely practical, but its convenience comes from assuming a trusted Python environment on both ends.

Common Pitfalls

  • Unpickling data from an untrusted source and exposing the program to arbitrary code execution.
  • Forgetting that the class definitions used during pickling must still be available when loading the data later.
  • Opening the file in text mode instead of binary mode.
  • Using pickle as a cross-language interchange format even though the format is designed for Python.
  • Treating pickle files as stable forever even though code changes and Python-version differences can affect compatibility.

Summary

  • pickle serializes Python objects to bytes and reconstructs them later.
  • It is convenient for trusted Python-to-Python persistence.
  • Use binary mode and a context manager when dumping and loading.
  • Never unpickle data from untrusted sources.
  • Pickle is powerful, but it is not a safe or portable format for every storage problem.
