How to Count Leaf and Non-Leaf Nodes in a Nested Dictionary in Python
Nested dictionaries in Python are frequently used to represent tree-like structures such as organizational hierarchies, file systems, JSON data, decision trees, and graph relationships. A common operation on these structures is identifying leaf nodes (endpoints with no further nesting) and non-leaf nodes (nodes that contain further nested dictionaries).
In this guide, you'll learn efficient methods to count and identify leaf and non-leaf nodes in nested Python dictionaries, with clear explanations and practical examples.
Understanding Leaf vs. Non-Leaf Nodes
In a nested dictionary used as a tree:
- Leaf node: A value that is not a dictionary. it's a terminal value (e.g., an integer, string, or list).
- Non-leaf node: A value that is a dictionary. It contains further nested key-value pairs.
Example structure:
data = {
'a': {
'b': 1, # 'b' is a leaf (value is 1)
'c': { # 'c' is a non-leaf (value is a dict)
'd': 2, # 'd' is a leaf (value is 2)
'e': 3 # 'e' is a leaf (value is 3)
}
}
}
| Node | Type | Reason |
|---|---|---|
'a' | Non-leaf | Value is a dictionary |
'b' | Leaf | Value is 1 (not a dict) |
'c' | Non-leaf | Value is a dictionary |
'd' | Leaf | Value is 2 (not a dict) |
'e' | Leaf | Value is 3 (not a dict) |
Result: 2 non-leaf nodes, 3 leaf nodes.
Using Recursion with isinstance() (Recommended)
Recursion naturally mirrors the tree structure of nested dictionaries. For each value, check if it's a dictionary (non-leaf) and recurse into it, or count it as a leaf:
def count_nodes(data):
"""Count leaf and non-leaf nodes in a nested dictionary."""
counts = {"leaf": 0, "non-leaf": 0}
def _traverse(d):
for value in d.values():
if isinstance(value, dict):
counts["non-leaf"] += 1
_traverse(value)
else:
counts["leaf"] += 1
_traverse(data)
return counts
# Example
data = {
'a': {
'b': 1,
'c': {'d': {'e': 2, 'f': 1}},
'g': {'h': {'i': 2, 'j': 1}}
}
}
result = count_nodes(data)
print(f"Node counts: {result}")
Output:
Node counts: {'leaf': 5, 'non-leaf': 5}
How it works:
- The inner
_traversefunction iterates over each value in the dictionary. - If a value is a dictionary (
isinstance(value, dict)), it's counted as a non-leaf and the function recurses into it. - If a value is anything else, it's counted as a leaf.
- The
countsdictionary accumulates totals across all recursive calls.
Nested dictionaries are inherently recursive data structures. Recursive traversal produces clean, readable code that directly mirrors the structure being analyzed.
Using an Iterative Approach with a Stack
For very deeply nested dictionaries where recursion might hit Python's recursion limit, an iterative approach using a stack avoids stack overflow issues:
def count_nodes_iterative(data):
"""Count nodes iteratively using a stack (DFS)."""
counts = {"leaf": 0, "non-leaf": 0}
stack = [data]
while stack:
current = stack.pop()
for value in current.values():
if isinstance(value, dict):
counts["non-leaf"] += 1
stack.append(value)
else:
counts["leaf"] += 1
return counts
data = {
'a': {
'b': 1,
'c': {'d': {'e': 2, 'f': 1}},
'g': {'h': {'i': 2, 'j': 1}}
}
}
result = count_nodes_iterative(data)
print(f"Node counts: {result}")
Output:
Node counts: {'leaf': 5, 'non-leaf': 5}
How it works:
- The root dictionary is pushed onto the stack.
- In each iteration, a dictionary is popped and its values are examined.
- Dictionary values are pushed onto the stack for further processing.
- Non-dictionary values are counted as leaves.
This performs a depth-first traversal without recursion.
Python's default recursion limit is 1,000. If your nested dictionaries could be deeper than that, use the iterative approach. For typical use cases, recursion is cleaner and preferred.
Collecting Node Names Instead of Just Counting
Sometimes you need to know which nodes are leaves and which aren't, not just the counts:
def classify_nodes(data, path=""):
"""Classify each node as leaf or non-leaf with its path."""
leaves = []
non_leaves = []
def _traverse(d, current_path):
for key, value in d.items():
node_path = f"{current_path}.{key}" if current_path else key
if isinstance(value, dict):
non_leaves.append(node_path)
_traverse(value, node_path)
else:
leaves.append((node_path, value))
_traverse(data, path)
return {"leaves": leaves, "non_leaves": non_leaves}
data = {
'a': {
'b': 1,
'c': {'d': {'e': 2, 'f': 1}},
'g': {'h': {'i': 2, 'j': 1}}
}
}
result = classify_nodes(data)
print("Non-leaf nodes:")
for node in result["non_leaves"]:
print(f" {node}")
print("\nLeaf nodes:")
for node, value in result["leaves"]:
print(f" {node} = {value}")
Output:
Non-leaf nodes:
a
a.c
a.c.d
a.g
a.g.h
Leaf nodes:
a.b = 1
a.c.d.e = 2
a.c.d.f = 1
a.g.h.i = 2
a.g.h.j = 1
Common Mistake: Mutable Default Arguments in Recursive Functions
A subtle Python bug occurs when using a mutable default argument (like a dictionary) in a recursive function. The default is shared across all calls to the function, causing incorrect results on subsequent invocations.
Wrong approach: mutable default argument
def count_nodes_buggy(data, counts={"leaf": 0, "non-leaf": 0}):
for value in data.values():
if isinstance(value, dict):
counts["non-leaf"] += 1
count_nodes_buggy(value, counts)
else:
counts["leaf"] += 1
return counts
data = {'a': {'b': 1, 'c': 2}}
# First call: correct
print(count_nodes_buggy(data))
# Second call: WRONG! Counts accumulate from the first call
print(count_nodes_buggy(data))
Output:
{'leaf': 2, 'non-leaf': 1}
{'leaf': 4, 'non-leaf': 2}
The second call shows doubled counts because the same dictionary object persists between calls.
Correct approach: initialize inside the function
def count_nodes(data):
counts = {"leaf": 0, "non-leaf": 0} # Fresh dict each call
def _traverse(d):
for value in d.values():
if isinstance(value, dict):
counts["non-leaf"] += 1
_traverse(value)
else:
counts["leaf"] += 1
_traverse(data)
return counts
data = {'a': {'b': 1, 'c': 2}}
print(count_nodes(data)) # {'leaf': 2, 'non-leaf': 1}
print(count_nodes(data)) # {'leaf': 2, 'non-leaf': 1} (correct!)
Mutable defaults like {} or [] in function signatures are shared across all calls. Always initialize mutable objects inside the function body.
Handling Edge Cases
Empty Dictionary
data = {}
result = count_nodes(data)
print(result)
Output:
{'leaf': 0, 'non-leaf': 0}
Flat Dictionary (No Nesting)
data = {'x': 1, 'y': 2, 'z': 3}
result = count_nodes(data)
print(result)
Output:
{'leaf': 3, 'non-leaf': 0}
Deeply Nested Dictionary
data = {'a': {'b': {'c': {'d': {'e': 1}}}}}
result = count_nodes(data)
print(result)
Output:
{'leaf': 1, 'non-leaf': 4}
Quick Comparison of Methods
| Method | Readability | Handles Deep Nesting | Extra Import |
|---|---|---|---|
| Recursion | ⭐⭐⭐ High | Up to recursion limit (~1000) | None |
| Stack (iterative DFS) | ⭐⭐ Medium | ✅ Unlimited depth | None |
| Queue (iterative BFS) | ⭐⭐ Medium | ✅ Unlimited depth | collections.deque |
All methods have O(n) time complexity and O(n) space complexity, where n is the total number of key-value pairs in the dictionary.
Conclusion
Counting leaf and non-leaf nodes in a nested Python dictionary is straightforward with the right traversal approach:
- Recursion with
isinstance()is the most readable and natural method for typical nested dictionaries. - Iterative traversal with a stack is safer for very deeply nested structures that might exceed Python's recursion limit.
- Use path tracking when you need to identify which specific nodes are leaves vs. non-leaves.
- Never use mutable default arguments in recursive functions. Initialize fresh objects inside the function body.
- Handle edge cases like empty dictionaries and flat (non-nested) dictionaries gracefully.