defaultdict of defaultdict?

Introduction

A defaultdict of defaultdict creates a nested dictionary where accessing missing keys at any level automatically creates the intermediate structure instead of raising KeyError. The standard approach is defaultdict(lambda: defaultdict(list)) for a two-level nesting, but for arbitrary depth you can define a recursive factory. This pattern is useful for building tree-like structures, accumulating grouped data, and parsing hierarchical configurations without manual key-existence checks.

Basic defaultdict Review

python
from collections import defaultdict

# Standard defaultdict with list factory
d = defaultdict(list)
d["fruits"].append("apple")
d["fruits"].append("banana")
d["vegetables"].append("carrot")

print(d)
# defaultdict(<class 'list'>, {'fruits': ['apple', 'banana'], 'vegetables': ['carrot']})

# No need to check whether a key exists: the list is created automatically

Two-Level Nesting

python
from collections import defaultdict

# defaultdict of defaultdict(list)
data = defaultdict(lambda: defaultdict(list))

data["2025"]["January"].append("New Year")
data["2025"]["March"].append("Spring Break")
data["2025"]["March"].append("St. Patrick's Day")
data["2024"]["December"].append("Christmas")

print(data["2025"]["March"])
# ['Spring Break', "St. Patrick's Day"]

# Accessing a missing key creates empty structures without error
print(data["2026"]["June"])
# []

The outer defaultdict needs a lambda because defaultdict(list) is an instance, not a callable. The default_factory must be a zero-argument callable that returns a fresh defaultdict(list) each time a missing key is accessed.

Why Not defaultdict(defaultdict(list))?

python
from collections import defaultdict

# This does NOT work: defaultdict(list) is an INSTANCE, not a callable
try:
    data = defaultdict(defaultdict(list))
except TypeError as exc:
    print(exc)
# first argument must be callable or None

# The default_factory must be a zero-argument CALLABLE (function or class)

# Correct: wrap in a lambda
data = defaultdict(lambda: defaultdict(list))

# Or use a named function
def list_dict():
    return defaultdict(list)

data = defaultdict(list_dict)

Arbitrary Depth with Recursive Factory

python
from collections import defaultdict

def tree():
    """A recursive defaultdict that creates nested dicts on access."""
    return defaultdict(tree)

# Nestable to any depth: every missing key becomes another tree()
t = tree()

# Leaves must be assigned explicitly, because a missing key always yields
# another tree(), never a list (calling .append on it raises AttributeError)
t["animals"]["mammals"]["dogs"]["breeds"] = ["beagle"]
t["animals"]["mammals"]["dogs"]["breeds"].append("collie")

config = tree()
config["database"]["primary"]["host"] = "db1.example.com"
config["database"]["primary"]["port"] = 5432
config["database"]["replica"]["host"] = "db2.example.com"
config["cache"]["redis"]["host"] = "redis.example.com"

print(config["database"]["primary"]["host"])
# db1.example.com
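
When the depth is known in advance and the leaves should be lists (or counters), a depth-limited factory may be a better fit than the open-ended tree(). The following is a minimal sketch: nested is a hypothetical helper, not part of the standard library, and it uses functools.partial so that each level knows how many levels remain below it.

```python
from collections import defaultdict
from functools import partial


def nested(depth, leaf):
    """Return a defaultdict nested `depth` levels deep with `leaf` at the bottom.

    `nested` is a hypothetical helper written for this example.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if depth == 1:
        # Innermost level: missing keys produce leaf() values, e.g. [] or 0.
        return defaultdict(leaf)
    # Outer levels: a missing key creates a structure one level shallower.
    return defaultdict(partial(nested, depth - 1, leaf))


events = nested(3, list)
events["2025"]["March"]["17"].append("Parade")
events["2025"]["March"]["17"].append("Concert")

print(events["2025"]["March"]["17"])
# ['Parade', 'Concert']
```

Unlike tree(), this structure stops nesting at the third key, so the leaf behaves like an ordinary list.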

Converting to Regular Dict

defaultdict creates empty entries on access, which can be surprising. Convert to a regular dict when you are done building the structure.

python
import json
from collections import defaultdict

data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")
data["users"]["editors"].append("Bob")

# Convert nested defaultdict to regular dict
def to_regular_dict(d):
    if isinstance(d, defaultdict):
        return {k: to_regular_dict(v) for k, v in d.items()}
    return d

regular = to_regular_dict(data)
print(json.dumps(regular, indent=2))
# {
#   "users": {
#     "admins": ["Alice"],
#     "editors": ["Bob"]
#   }
# }

# Quick one-level conversion
regular = {k: dict(v) for k, v in data.items()}
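
An alternative to copying into regular dicts is to switch off auto-creation in place. Setting default_factory to None makes a defaultdict raise KeyError on missing keys, like a plain dict. The sketch below assumes a hypothetical helper named freeze that applies this recursively.

```python
from collections import defaultdict


def freeze(d):
    """Recursively disable auto-creation on nested defaultdicts.

    `freeze` is a hypothetical helper written for this example.
    """
    if isinstance(d, defaultdict):
        d.default_factory = None  # missing keys now raise KeyError
        for value in d.values():
            freeze(value)
    return d


data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")

freeze(data)

try:
    data["userz"]["admins"]  # typo in the key
except KeyError as exc:
    print("KeyError:", exc)
# KeyError: 'userz'

print("userz" in data)
# False
```

The objects are still defaultdict instances after freezing, so this suits catching typos during reads. A full conversion remains the safer choice when downstream code checks for an exact dict type.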

Practical Examples

Grouping Data by Multiple Keys

python
from collections import defaultdict

sales = [
    ("2025", "Q1", "Electronics", 1500),
    ("2025", "Q1", "Clothing", 800),
    ("2025", "Q2", "Electronics", 1700),
    ("2024", "Q4", "Electronics", 1200),
]

grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

for year, quarter, category, amount in sales:
    grouped[year][quarter][category] += amount

print(grouped["2025"]["Q1"]["Electronics"])  # 1500
print(grouped["2025"]["Q1"]["Clothing"])     # 800

Building an Adjacency List

python
from collections import defaultdict

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

graph = defaultdict(lambda: defaultdict(int))

for src, dst in edges:
    graph[src][dst] = 1  # Unweighted
    graph[dst][src] = 1  # Undirected

print(dict(graph["A"]))  # {'B': 1, 'C': 1}

Common Pitfalls

  • Passing an instance instead of a callable: defaultdict(defaultdict(list)) raises TypeError because defaultdict(list) is an instance, not a function. Use defaultdict(lambda: defaultdict(list)) — the lambda is the callable factory.
  • Accidental key creation on read access: A read such as if data["missing_key"]: creates the key with a default value as a side effect. Use if "missing_key" in data or data.get("missing_key") to check for a key without creating the entry. This is especially problematic in nested defaultdicts, where a single mistyped lookup can create an entire empty subtree.
  • Lambda pickling issues: Lambdas are not picklable, so defaultdict(lambda: defaultdict(list)) cannot be serialized with pickle or used with multiprocessing. Use a named function or functools.partial instead: defaultdict(partial(defaultdict, list)).
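  • Forgetting to convert before serialization: json.dumps(data) accepts a defaultdict because it is a dict subclass, but any empty entries created by stray reads are written out too, and other serializers (for example, safe-mode YAML dumpers) may reject or specially tag dict subclasses. Convert to regular dicts before serializing to JSON, YAML, or other formats so the output is predictable.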
  • Deep nesting making code hard to understand: More than 2-3 levels of nested defaultdict becomes unreadable. For complex hierarchical data, use dataclasses, named tuples, or a proper tree/config class instead of deeply nested defaultdicts.
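
The pickling pitfall above can be checked with a short round trip. This is a minimal sketch using only the standard library. The exact exception type raised for the lambda may differ between Python versions, so the example catches several.

```python
import pickle
from collections import defaultdict
from functools import partial

# A lambda factory cannot be pickled by the standard pickle module.
with_lambda = defaultdict(lambda: defaultdict(list))
with_lambda["users"]["admins"].append("Alice")
try:
    pickle.dumps(with_lambda)
    lambda_ok = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    lambda_ok = False
    print("pickle failed:", type(exc).__name__)

# partial(defaultdict, list) is built from importable objects, so it pickles.
with_partial = defaultdict(partial(defaultdict, list))
with_partial["users"]["admins"].append("Alice")
restored = pickle.loads(pickle.dumps(with_partial))

restored["users"]["editors"].append("Bob")  # auto-creation still works
print(restored["users"]["editors"])
# ['Bob']
```

A module-level named function works the same way as partial here, provided the module that defines it is importable wherever the data is unpickled.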

Summary

  • Use defaultdict(lambda: defaultdict(list)) for two-level nested auto-creating dicts
  • The factory argument must be a callable (function/class), not an instance
  • Use a recursive tree() function for arbitrary-depth nesting
  • Convert to regular dict before serialization or when done building the structure
  • Use named functions instead of lambdas if you need pickling support
