Convert a Pandas DataFrame to a dictionary

Introduction

Pandas makes it easy to convert a DataFrame into dictionary-shaped data, but the exact result depends on the orientation you choose. The right option depends on whether you want column-wise output, row-wise records, or a structure that preserves both index and column metadata.

The Main API

The standard tool is DataFrame.to_dict(). Its orient argument controls the shape of the output.

python
import pandas as pd

# Small example frame with a meaningful index ("a", "b")
df = pd.DataFrame(
    {
        "name": ["Ada", "Grace"],
        "age": [36, 47]
    },
    index=["a", "b"]
)

# No orient argument: uses the default orientation
print(df.to_dict())

The default orientation, 'dict', is column-centric. The result is a dictionary of dictionaries: each top-level key is a column name, and each inner dictionary maps index labels to that column's values.
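To make the default shape concrete, here is a minimal sketch that rebuilds the same example frame and inspects the nested result. The variable name by_column is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ada", "Grace"], "age": [36, 47]},
    index=["a", "b"],
)

# Default orientation (same as orient="dict"):
# outer keys are column names, inner keys are index labels.
by_column = df.to_dict()

print(by_column["name"]["a"])  # value in column "name" at index label "a"
print(by_column["age"])        # the whole "age" column as {label: value}
```

Note the lookup order: column first, then index label. That is the reverse of what row-oriented code usually expects.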

Useful Orientations

The most commonly used orientations are:

  • 'dict'
  • 'list'
  • 'records'
  • 'index'
  • 'split'

records for JSON-like Rows

records is often the most useful when you want one dictionary per row.

python
print(df.to_dict(orient="records"))

Output:

python
[{"name": "Ada", "age": 36}, {"name": "Grace", "age": 47}]

This is a good fit for APIs, templating, and test fixtures.

list for Column Arrays

If another tool expects each column as a simple list, use list:

python
print(df.to_dict(orient="list"))

Output:

python
{"name": ["Ada", "Grace"], "age": [36, 47]}

index for Lookup by Index

When the index is meaningful, index makes each index value a top-level key.

python
print(df.to_dict(orient="index"))

Output:

python
{"a": {"name": "Ada", "age": 36}, "b": {"name": "Grace", "age": 47}}

This is convenient when you want quick lookup by row label.
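As a rough illustration of that lookup pattern, the sketch below uses the same example frame. The name by_label and the missing label "z" are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ada", "Grace"], "age": [36, 47]},
    index=["a", "b"],
)

# Outer keys are index labels, inner keys are column names.
by_label = df.to_dict(orient="index")

row_b = by_label["b"]          # plain dict for the row labelled "b"
print(row_b["name"])

# Ordinary dict methods apply, so a missing label can be handled with .get()
missing = by_label.get("z")    # "z" is not an index label in this frame
print(missing)
```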

split for Structure Preservation

split separates the index, column labels, and row data into distinct keys.

python
print(df.to_dict(orient="split"))

Output:

python
{
    "index": ["a", "b"],
    "columns": ["name", "age"],
    "data": [["Ada", 36], ["Grace", 47]]
}

This is useful when you need to serialize and later reconstruct the tabular structure cleanly.
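One way to do that round trip is sketched below. It passes the three keys back to the DataFrame constructor. Treat it as a sketch: dtypes are inferred again from the raw values, so columns with special dtypes may not come back exactly as they were.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ada", "Grace"], "age": [36, 47]},
    index=["a", "b"],
)

payload = df.to_dict(orient="split")

# Rebuild the frame from the three parts of the split payload.
restored = pd.DataFrame(
    data=payload["data"],
    index=payload["index"],
    columns=payload["columns"],
)

print(restored)
```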

Choosing the Right Shape

A practical rule of thumb is:

  • use records for row-oriented downstream code
  • use list for column-wise plotting or simple payloads
  • use index when the index has business meaning
  • use split when you need explicit structure metadata

If you just need a quick Python structure and do not care much about shape, start with records (a list of row dictionaries) or the default dict orientation (a dictionary keyed by column name).

Watch Data Types

Conversion shape is only part of the story. Pandas may contain NumPy scalar types, timestamps, missing values, or custom dtypes that are fine in Python but awkward for JSON serialization. Converting to a dictionary does not automatically make every value API-friendly.

For example, timestamps may need string formatting before serialization:

python
df = pd.DataFrame({"created_at": pd.to_datetime(["2025-01-01", "2025-01-02"])})
df["created_at"] = df["created_at"].dt.strftime("%Y-%m-%d")
print(df.to_dict(orient="records"))

If the dictionary is only an intermediate step inside Python, you may not care. If it is leaving the Python process, decide early whether you need plain strings, native numbers, or a JSON encoder that understands pandas-related types.
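If you go the "plain values" route, a small cleanup pass over the records is one option. The helper below, to_json_safe_records, is hypothetical (it is not part of pandas), and the choices it makes, such as ISO strings for timestamps and None for missing values, are assumptions you may want to change.

```python
import json
import math

import pandas as pd


def to_json_safe_records(frame):
    """Hypothetical helper: row dicts containing only JSON-friendly values."""
    records = []
    for row in frame.to_dict(orient="records"):
        clean = {}
        for key, value in row.items():
            if value is pd.NaT:
                value = None                  # missing timestamp
            elif isinstance(value, pd.Timestamp):
                value = value.isoformat()     # timestamp -> ISO 8601 string
            elif hasattr(value, "item"):
                value = value.item()          # NumPy scalar -> Python scalar
            if isinstance(value, float) and math.isnan(value):
                value = None                  # NaN -> None (JSON null)
            clean[key] = value
        records.append(clean)
    return records


df = pd.DataFrame(
    {
        "name": ["Ada", "Grace"],
        "score": [9.5, float("nan")],
        "created_at": pd.to_datetime(["2025-01-01", "2025-01-02"]),
    }
)

payload = json.dumps(to_json_safe_records(df))
print(payload)
```

Doing the cleanup on the Python side keeps the DataFrame itself untouched, at the cost of a per-value loop that may be slow on large frames.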

Common Pitfalls

  • Using the default orientation and then being surprised that the result is column-oriented instead of row-oriented.
  • Assuming to_dict() makes data immediately JSON-safe. Timestamps, NaN, and NumPy values may still need cleanup.
  • Forgetting that orient="index" requires unique index labels. Each label becomes a dictionary key, so pandas raises a ValueError when labels repeat.
  • Choosing records for very large frames when a streamed export or direct file format would be more memory-efficient.
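The index pitfall is easy to reproduce. The sketch below assumes a recent pandas release, where orient="index" rejects duplicate labels with a ValueError. The frame dup and the fallback to records are illustrative choices, not the only workaround.

```python
import pandas as pd

# Two rows share the index label "a".
dup = pd.DataFrame({"age": [36, 47]}, index=["a", "a"])

try:
    dup.to_dict(orient="index")
    raised = False
except ValueError as exc:
    raised = True
    print(f"to_dict failed: {exc}")

# One possible fallback: move the index into a column and export rows instead.
# An unnamed index becomes a column called "index" after reset_index().
rows = dup.reset_index().to_dict(orient="records")
print(rows)
```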

Summary

  • Use DataFrame.to_dict() to convert a pandas DataFrame into dictionary-shaped data.
  • Pick orient based on the structure the next step expects.
  • records is the most common row-wise output format.
  • list and dict are convenient for column-oriented processing.
  • Clean up special dtypes if the dictionary will be serialized outside Python.
