Introduction
A defaultdict of defaultdict creates a nested dictionary where accessing missing keys at any level automatically creates the intermediate structure instead of raising KeyError. The standard approach is defaultdict(lambda: defaultdict(list)) for a two-level nesting, but for arbitrary depth you can define a recursive factory. This pattern is useful for building tree-like structures, accumulating grouped data, and parsing hierarchical configurations without manual key-existence checks.
Basic defaultdict Review
from collections import defaultdict

# Standard defaultdict with list factory
d = defaultdict(list)
d["fruits"].append("apple")
d["fruits"].append("banana")
d["vegetables"].append("carrot")

print(d)
# defaultdict(<class 'list'>, {'fruits': ['apple', 'banana'], 'vegetables': ['carrot']})

# No need to check if the key exists: the list is created automatically
Two-Level Nesting
from collections import defaultdict

# defaultdict of defaultdict(list)
data = defaultdict(lambda: defaultdict(list))

data["2025"]["January"].append("New Year")
data["2025"]["March"].append("Spring Break")
data["2025"]["March"].append("St. Patrick's Day")
data["2024"]["December"].append("Christmas")

print(data["2025"]["March"])
# ['Spring Break', "St. Patrick's Day"]

# Accessing a missing key creates empty structures without error
print(data["2026"]["June"])
# []
The outer defaultdict takes a lambda because default_factory must be a zero-argument callable. defaultdict(list) is an already-built instance, not a callable, so the lambda is what produces a fresh defaultdict(list) each time a missing outer key is accessed.
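To make the role of the factory concrete, here is a minimal sketch that counts how often it runs. The names make_inner and calls are illustrative and do not appear in the example above; a named function stands in for the lambda only so that the calls can be recorded.

```python
from collections import defaultdict

calls = []  # illustrative log of factory invocations


def make_inner():
    """Zero-argument factory: returns a fresh defaultdict(list) on every call."""
    calls.append(1)
    return defaultdict(list)


data = defaultdict(make_inner)

data["2025"]["January"].append("New Year")    # "2025" is missing: factory runs
data["2025"]["March"].append("Spring Break")  # "2025" exists: factory does not run
data["2024"]["December"].append("Christmas")  # "2024" is missing: factory runs again

print(len(calls))                    # 2
print(data["2025"] is data["2024"])  # False: each outer key gets its own inner dict
```

The factory runs once per missing outer key, so every outer key ends up with an independent inner dictionary.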
Why Not defaultdict(defaultdict(list))?
from collections import defaultdict

# This does NOT work: defaultdict(list) is an INSTANCE, not a callable
try:
    data = defaultdict(defaultdict(list))
except TypeError as exc:
    print(exc)
# first argument must be callable or None

# The default_factory must be a CALLABLE (function/class)

# Correct: wrap in a lambda
data = defaultdict(lambda: defaultdict(list))

# Or use a named function
def list_dict():
    return defaultdict(list)

data = defaultdict(list_dict)
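A third callable that fits here is functools.partial, which matters mainly when the structure has to be pickled. The sketch below is a small check under that assumption, not a benchmark: it round-trips a partial-based structure through pickle and confirms that the lambda-based one is rejected. The variable names restored and lambda_pickled are illustrative.

```python
import pickle
from collections import defaultdict
from functools import partial

# partial(defaultdict, list) is a zero-argument callable that returns
# a new defaultdict(list) each time it is called.
data = defaultdict(partial(defaultdict, list))
data["users"]["admins"].append("Alice")

# The partial-based structure survives a pickle round trip.
restored = pickle.loads(pickle.dumps(data))
restored["users"]["editors"].append("Bob")  # auto-creation still works
print(restored["users"]["admins"])          # ['Alice']

# The lambda-based version is rejected by pickle.
lambda_data = defaultdict(lambda: defaultdict(list))
try:
    pickle.dumps(lambda_data)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(lambda_pickled)  # False
```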
Arbitrary Depth with Recursive Factory
from collections import defaultdict

def tree():
    """A recursive defaultdict that creates nested dicts on access."""
    return defaultdict(tree)

# Infinitely nestable dictionary
t = tree()

# Every missing key yields another tree, never a list, so calling
# .append on a fresh node raises AttributeError. Assign the leaf explicitly.
t["animals"]["mammals"]["dogs"]["breeds"] = ["beagle"]
t["animals"]["mammals"]["dogs"]["breeds"].append("poodle")

# More practical: assign plain values at the leaves
config = tree()
config["database"]["primary"]["host"] = "db1.example.com"
config["database"]["primary"]["port"] = 5432
config["database"]["replica"]["host"] = "db2.example.com"
config["cache"]["redis"]["host"] = "redis.example.com"

print(config["database"]["primary"]["host"])
# db1.example.com
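When the depth is known in advance and the leaves should be lists (or counters), a fixed-depth factory is one option. The helper below, nested, is a hypothetical name introduced for this sketch; it uses functools.partial to bind its arguments so that each level's factory can still be called with no arguments.

```python
from collections import defaultdict
from functools import partial


def nested(depth, leaf=list):
    """Return a defaultdict nested `depth` levels deep.

    Missing keys at the innermost level are filled by calling `leaf`.
    Hypothetical helper, shown as a sketch.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    if depth == 1:
        return defaultdict(leaf)
    # partial binds the arguments so the factory takes none
    return defaultdict(partial(nested, depth - 1, leaf))


taxonomy = nested(3)  # three dict levels, then lists
taxonomy["animals"]["mammals"]["dogs"].append("beagle")
taxonomy["animals"]["mammals"]["dogs"].append("poodle")
taxonomy["animals"]["birds"]["owls"].append("barn owl")

print(taxonomy["animals"]["mammals"]["dogs"])  # ['beagle', 'poodle']
```

Unlike tree(), this structure stops nesting at a fixed level, so .append works at the leaves without an explicit assignment.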
Converting to Regular Dict
defaultdict creates empty entries on access, which can be surprising. Convert to a regular dict when you are done building the structure.
import json
from collections import defaultdict

data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")
data["users"]["editors"].append("Bob")

# Convert nested defaultdict to regular dict
def to_regular_dict(d):
    if isinstance(d, defaultdict):
        return {k: to_regular_dict(v) for k, v in d.items()}
    return d

regular = to_regular_dict(data)
print(json.dumps(regular, indent=2))
# {
#   "users": {
#     "admins": [
#       "Alice"
#     ],
#     "editors": [
#       "Bob"
#     ]
#   }
# }

# Quick conversion when the nesting is exactly two levels deep
regular = {k: dict(v) for k, v in data.items()}
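If copying the structure is undesirable, an alternative not covered above is to switch off auto-creation in place by setting default_factory to None, after which missing keys raise KeyError as in a plain dict. The helper name freeze is hypothetical; this is a sketch of the idea rather than a replacement for to_regular_dict.

```python
from collections import defaultdict


def freeze(d):
    """Recursively disable auto-creation in place (hypothetical helper)."""
    if isinstance(d, defaultdict):
        d.default_factory = None
        for value in d.values():
            freeze(value)
    return d


data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")

freeze(data)

try:
    data["users"]["guests"]
    raised = False
except KeyError:
    raised = True

print(raised)                     # True
print("guests" in data["users"])  # False: nothing was created
```

The objects are still defaultdict instances, so code that checks for the exact dict type still needs the conversion shown above.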
Practical Examples
Grouping Data by Multiple Keys
from collections import defaultdict

sales = [
    ("2025", "Q1", "Electronics", 1500),
    ("2025", "Q1", "Clothing", 800),
    ("2025", "Q2", "Electronics", 1700),
    ("2024", "Q4", "Electronics", 1200),
]

grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

for year, quarter, category, amount in sales:
    grouped[year][quarter][category] += amount

print(grouped["2025"]["Q1"]["Electronics"])  # 1500
print(grouped["2025"]["Q1"]["Clothing"])     # 800
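Once the data is grouped, the nested structure can be rolled up by iterating with .items(), which reads existing keys without creating new ones. The following sketch repeats the sample data so that it runs on its own; quarter_totals is an illustrative name.

```python
from collections import defaultdict

sales = [
    ("2025", "Q1", "Electronics", 1500),
    ("2025", "Q1", "Clothing", 800),
    ("2025", "Q2", "Electronics", 1700),
    ("2024", "Q4", "Electronics", 1200),
]

grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
for year, quarter, category, amount in sales:
    grouped[year][quarter][category] += amount

# Collapse the innermost level into one total per (year, quarter)
quarter_totals = {
    (year, quarter): sum(categories.values())
    for year, quarters in grouped.items()
    for quarter, categories in quarters.items()
}

print(quarter_totals[("2025", "Q1")])  # 2300
```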
Building an Adjacency List
from collections import defaultdict

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

graph = defaultdict(lambda: defaultdict(int))

for src, dst in edges:
    graph[src][dst] = 1  # Unweighted
    graph[dst][src] = 1  # Undirected

print(dict(graph["A"]))  # {'B': 1, 'C': 1}
Common Pitfalls
Passing an instance instead of a callable: defaultdict(defaultdict(list)) raises TypeError because defaultdict(list) is an instance, not a function. Use defaultdict(lambda: defaultdict(list)) — the lambda is the callable factory.
Accidental key creation on read access: Any subscript lookup, including a test such as if data["missing_key"]:, creates the key with a default value as a side effect. Use if "missing_key" in data, or data.get("missing_key"), to check for a key without creating the entry. This is especially problematic in nested defaultdicts, where a single read can create an entire subtree.
Lambda pickling issues: Lambdas cannot be pickled, so defaultdict(lambda: defaultdict(list)) cannot be serialized with pickle or sent to worker processes through multiprocessing queues and pools. Use a module-level named function or functools.partial instead: defaultdict(partial(defaultdict, list)).
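Forgetting to convert before serialization: json.dumps(data) does accept a defaultdict, because it is a dict subclass, but other serializers may not treat it as a plain mapping, and the live default_factory keeps creating keys on any later read. Convert to regular dicts before serializing to JSON, YAML, or other formats.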
Deep nesting making code hard to understand: More than two or three levels of nested defaultdict quickly become hard to read. For complex hierarchical data, use dataclasses, named tuples, or a proper tree/config class instead of deeply nested defaultdicts.
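The accidental-key-creation pitfall above is easy to reproduce. The sketch below contrasts a subscript read, which builds the missing path, with in and .get(), which leave the structure untouched; the key names are illustrative.

```python
from collections import defaultdict

data = defaultdict(lambda: defaultdict(list))
data["users"]["admins"].append("Alice")

# Reading with [] creates the whole missing path as a side effect
if data["groups"]["staff"]:
    pass

print(sorted(data))          # ['groups', 'users']
print(dict(data["groups"]))  # {'staff': []}

# `in` and .get() never call the factory
fresh = defaultdict(lambda: defaultdict(list))
fresh["users"]["admins"].append("Alice")

has_groups = "groups" in fresh
staff = fresh.get("groups", {}).get("staff", [])

print(has_groups)     # False
print(staff)          # []
print(sorted(fresh))  # ['users']
```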
Summary
Use defaultdict(lambda: defaultdict(list)) for two-level nested auto-creating dicts
The factory argument must be a callable (function/class), not an instance
Use a recursive tree() function for arbitrary-depth nesting
Convert to regular dict before serialization or when done building the structure
Use named functions instead of lambdas if you need pickling support