defaultdict of defaultdict?


Introduction

A `defaultdict` of `defaultdict` is a nested dictionary in which accessing a missing key creates the intermediate structure automatically instead of raising `KeyError`. A common approach is `defaultdict(lambda: defaultdict(list))` for two-level nesting; for arbitrary depth you can define a recursive factory. The pattern is useful for building tree-like structures, accumulating grouped data, and parsing hierarchical configurations without manual key-existence checks.

Basic `defaultdict` Review

```python
from collections import defaultdict

# Standard defaultdict with list factory
d = defaultdict(list)
d["fruits"].append("apple")
d["fruits"].append("banana")
d["vegetables"].append("carrot")

print(d)
# defaultdict(<class 'list'>, {'fruits': ['apple', 'banana'], 'vegetables': ['carrot']})

# No need to check whether a key exists: the list is created on first access
```

Two-Level Nesting

```python
from collections import defaultdict

# defaultdict of defaultdict(list)
data = defaultdict(lambda: defaultdict(list))

data["2025"]["January"].append("New Year")
data["2025"]["March"].append("Spring Break")
data["2025"]["March"].append("St. Patrick's Day")
data["2024"]["December"].append("Christmas")

print(data["2025"]["March"])
# ['Spring Break', "St. Patrick's Day"]

# Accessing a missing key creates empty structures without error
print(data["2026"]["June"])
# []
```

The outer `defaultdict` needs a lambda because `defaultdict(list)` is an instance, not a callable. The factory must be a zero-argument callable that returns a new `defaultdict(list)` each time a missing key is accessed.
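To make the role of the factory concrete, the sketch below swaps the lambda for a named function that records each call. The names `make_inner` and `calls` are illustrative assumptions, not part of `collections`; the point is only that the factory runs once per missing outer key and hands back a fresh inner dictionary each time.

```python
from collections import defaultdict

calls = []  # illustrative log of factory invocations


def make_inner():
    """Zero-argument factory: returns a brand-new inner defaultdict."""
    calls.append("called")
    return defaultdict(list)


data = defaultdict(make_inner)

data["a"]["x"].append(1)  # "a" is missing, so make_inner() runs
data["a"]["y"].append(2)  # "a" exists now, so the factory is not called
data["b"]["x"].append(3)  # "b" is missing, so make_inner() runs again

print(len(calls))              # 2
print(data["a"] is data["b"])  # False: each key got its own inner dict
```

If the factory returned a shared object instead of building a new one, every outer key would point at the same inner dictionary, which is rarely what you want.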

Why Not `defaultdict(defaultdict(list))`?

```python
from collections import defaultdict

# This does NOT work: defaultdict(list) is an INSTANCE, not a callable,
# and default_factory must be a CALLABLE (function/class) or None
try:
    data = defaultdict(defaultdict(list))
except TypeError as exc:
    print(exc)  # first argument must be callable or None

# Correct: wrap in a lambda
data = defaultdict(lambda: defaultdict(list))

# Or use a named function
def list_dict():
    return defaultdict(list)

data = defaultdict(list_dict)
```

Arbitrary Depth with Recursive Factory

```python
from collections import defaultdict

def tree():
    """A recursive defaultdict that creates nested dicts on access."""
    return defaultdict(tree)

# Infinitely nestable dictionary. Every missing key becomes another tree,
# so a leaf value (string, number, list) must be assigned explicitly.
t = tree()
t["animals"]["mammals"]["dogs"]["breeds"] = ["beagle"]
t["animals"]["mammals"]["dogs"]["breeds"].append("poodle")

config = tree()
config["database"]["primary"]["host"] = "db1.example.com"
config["database"]["primary"]["port"] = 5432
config["database"]["replica"]["host"] = "db2.example.com"
config["cache"]["redis"]["host"] = "redis.example.com"

print(config["database"]["primary"]["host"])
# db1.example.com
```
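The recursive `tree()` never produces a list or counter on its own, because every missing key becomes another tree. When the depth is known up front, one possible alternative is a factory that nests a fixed number of levels and ends in a chosen leaf type. The helper name `nested` and its parameters are assumptions made for this sketch; nothing like it ships in `collections`.

```python
from collections import defaultdict


def nested(depth, leaf_factory):
    """Build a defaultdict nested `depth` levels deep with `leaf_factory` leaves.

    Hypothetical helper for illustration only.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if depth == 1:
        return defaultdict(leaf_factory)
    # Each missing key at this level creates a structure one level shallower.
    return defaultdict(lambda: nested(depth - 1, leaf_factory))


events = nested(3, list)
events["2025"]["March"]["holidays"].append("St. Patrick's Day")
events["2025"]["March"]["holidays"].append("Spring Break")

print(events["2025"]["March"]["holidays"])
# ["St. Patrick's Day", 'Spring Break']
```

The same helper with `int` as the leaf factory would give nested counters, at the cost of fixing the depth in advance.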

Converting to Regular Dict

A `defaultdict` inserts an entry whenever a missing key is read, not only when it is written, which can be surprising. Convert to a regular dict when you are done building the structure.

```python
import json
from collections import defaultdict

data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")
data["users"]["editors"].append("Bob")

# Convert nested defaultdict to regular dict
def to_regular_dict(d):
    if isinstance(d, defaultdict):
        return {k: to_regular_dict(v) for k, v in d.items()}
    return d

regular = to_regular_dict(data)
print(json.dumps(regular, indent=2))
# {
#   "users": {
#     "admins": ["Alice"],
#     "editors": ["Bob"]
#   }
# }

# Shortcut when the structure is exactly two levels deep
regular = {k: dict(v) for k, v in data.items()}
```
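If copying the structure is not desirable, another option is to switch off auto-creation in place by setting the `default_factory` attribute to `None`, after which missing keys raise `KeyError` like in a plain dict. The sketch below first shows the accidental insertion and then freezes the structure; `freeze` is a hypothetical helper name chosen for this example.

```python
from collections import defaultdict


def freeze(d):
    """Disable auto-creation in place, recursing into nested defaultdicts.

    Hypothetical helper; it relies only on the `default_factory` attribute.
    """
    if isinstance(d, defaultdict):
        d.default_factory = None
        for value in d.values():
            freeze(value)
    return d


data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")

# A plain read of a missing key silently inserts it.
_ = data["groups"]["staff"]
print("groups" in data)  # True

del data["groups"]
freeze(data)

try:
    data["groups"]["staff"]
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'groups'
```

The objects are still `defaultdict` instances after freezing, so code that checks for an exact `dict` type would still need the conversion shown above.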

Practical Examples

Grouping Data by Multiple Keys

```python
from collections import defaultdict

sales = [
    ("2025", "Q1", "Electronics", 1500),
    ("2025", "Q1", "Clothing", 800),
    ("2025", "Q2", "Electronics", 1700),
    ("2024", "Q4", "Electronics", 1200),
]

grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

for year, quarter, category, amount in sales:
    grouped[year][quarter][category] += amount

print(grouped["2025"]["Q1"]["Electronics"])  # 1500
print(grouped["2025"]["Q1"]["Clothing"])     # 800
```
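Once the data is grouped, reading it back is usually done by iteration rather than by key lookup. As a rough sketch, the snippet below rebuilds the same three-level structure and sums the amounts per year; the variable name `totals` is an assumption for this example.

```python
from collections import defaultdict

grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
grouped["2025"]["Q1"]["Electronics"] += 1500
grouped["2025"]["Q1"]["Clothing"] += 800
grouped["2025"]["Q2"]["Electronics"] += 1700
grouped["2024"]["Q4"]["Electronics"] += 1200

# .items() and .values() only visit keys that already exist,
# so iterating never creates new entries.
totals = {
    year: sum(amount for cats in quarters.values() for amount in cats.values())
    for year, quarters in grouped.items()
}

print(totals)  # {'2025': 4000, '2024': 1200}
```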

Building an Adjacency List

```python
from collections import defaultdict

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

graph = defaultdict(lambda: defaultdict(int))

for src, dst in edges:
    graph[src][dst] = 1  # Unweighted
    graph[dst][src] = 1  # Undirected

print(dict(graph["A"]))  # {'B': 1, 'C': 1}
```
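The `int` factory becomes more useful when edges carry weights, since repeated edges can be accumulated with `+=` without initializing anything. The following is a minimal sketch of that variation; the `weighted_edges` data is made up for illustration.

```python
from collections import defaultdict

# Hypothetical weighted edges; the pair ("A", "B") appears twice on purpose.
weighted_edges = [("A", "B", 4), ("A", "C", 1), ("A", "B", 2), ("C", "D", 7)]

graph = defaultdict(lambda: defaultdict(int))

for src, dst, weight in weighted_edges:
    graph[src][dst] += weight  # a missing edge starts at 0
    graph[dst][src] += weight  # undirected

print(dict(graph["A"]))   # {'B': 6, 'C': 1}
print("D" in graph["A"])  # False; `in` does not create the key
```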

Common Pitfalls

- Passing an instance instead of a callable: `defaultdict(defaultdict(list))` raises `TypeError` because `defaultdict(list)` is an instance, not a callable. Use `defaultdict(lambda: defaultdict(list))`, where the lambda is the callable factory.
- Accidental key creation on read access: a check such as `if data["missing_key"]:` creates the key with a default value as a side effect. Use `if "missing_key" in data:` or `data.get("missing_key")` to test for existence without creating the entry. This is especially problematic in nested defaultdicts, where entire subtrees get created accidentally.
- Lambda pickling issues: lambdas cannot be pickled by the standard `pickle` module, so `defaultdict(lambda: defaultdict(list))` cannot be serialized with `pickle` or passed between processes with `multiprocessing`. Use a module-level named function or `functools.partial` instead: `defaultdict(partial(defaultdict, list))`.
- Forgetting to convert before serialization: `json.dumps(data)` accepts a `defaultdict` because it is a `dict` subclass, but other formats and stricter serializers may reject or specially tag dict subclasses, and stray keys created by earlier reads end up in the output. Convert to regular dicts before serializing to JSON, YAML, or other formats.
- Deep nesting making code hard to understand: more than 2-3 levels of nested `defaultdict` become hard to read. For complex hierarchical data, use dataclasses, named tuples, or a proper tree/config class instead of deeply nested defaultdicts.
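The pickling pitfall is easy to check directly. The sketch below round-trips a structure built with `functools.partial` and then shows the lambda-based version failing; the exact exception type raised for a lambda can vary between Python versions, so the example catches both common ones and prints only the type name.

```python
import pickle
from collections import defaultdict
from functools import partial

# partial(defaultdict, list) is a picklable zero-argument factory.
picklable = defaultdict(partial(defaultdict, list))
picklable["users"]["admins"].append("Alice")

restored = pickle.loads(pickle.dumps(picklable))
restored["users"]["editors"].append("Bob")  # auto-creation survives the round trip
print(restored["users"]["admins"])  # ['Alice']

with_lambda = defaultdict(lambda: defaultdict(list))
try:
    pickle.dumps(with_lambda)
except (pickle.PicklingError, AttributeError) as exc:
    print("cannot pickle:", type(exc).__name__)
```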

Summary

- Use `defaultdict(lambda: defaultdict(list))` for two-level nested auto-creating dicts
- The factory argument must be a callable (function/class), not an instance
- Use a recursive `tree()` function for arbitrary-depth nesting
- Convert to regular dict before serialization or when done building the structure
- Use named functions instead of lambdas if you need pickling support
