How to Convert a List of Paths to a Tree Dictionary in Python
Path-like data, such as file system paths, URL structures, or category hierarchies, often arrives as flat lists like ['a', 'b', 'c']. Converting these into nested dictionaries like {'a': {'b': {'c': {}}}} enables natural navigation, manipulation, and visualization of hierarchical relationships.
In this guide, you will learn several approaches to building tree dictionaries from path lists, ranging from procedural to functional styles, along with practical techniques for attaching leaf values, parsing string paths, and converting a tree back into paths.
Iterative Pointer Approach (Recommended)
The most straightforward method builds the tree by moving a reference (pointer) deeper into the nested dictionary as it processes each segment of a path:
import json

paths = [
    ["home", "user", "documents"],
    ["home", "user", "images"],
    ["home", "admin", "config"],
    ["etc", "nginx"]
]

tree = {}
for path in paths:
    current = tree
    for node in path:
        if node not in current:
            current[node] = {}
        current = current[node]

print(json.dumps(tree, indent=2))
Output:
{
  "home": {
    "user": {
      "documents": {},
      "images": {}
    },
    "admin": {
      "config": {}
    }
  },
  "etc": {
    "nginx": {}
  }
}
How it works:
- For each path, current starts pointing at the root of the tree
- For each node in the path, a new empty dictionary is created if that key does not already exist
- current then moves one level deeper by reassigning itself to the child dictionary
- When paths share common prefixes (like ["home", "user"]), the existing branches are reused rather than overwritten
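The steps above can be wrapped in a small helper that makes the prefix reuse visible. This is only a minimal sketch: the name insert_path and its "number of new nodes" return value are hypothetical choices made for illustration.

```python
def insert_path(tree, path):
    """Insert one path into tree and return how many new nodes were created."""
    created = 0
    current = tree  # Start at the root for every path
    for node in path:
        if node not in current:
            current[node] = {}  # Create the missing branch
            created += 1
        current = current[node]  # Step one level deeper
    return created


tree = {}
first = insert_path(tree, ["home", "user", "documents"])
second = insert_path(tree, ["home", "user", "images"])

print(first)   # 3: every node is new
print(second)  # 1: "home" and "user" are reused
print(tree)
```

The second call creates only one node because the "home" and "user" branches already exist from the first call.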
Attaching Values at Leaf Nodes
Often you need to associate data with the end of each path rather than leaving empty dictionaries at the leaves:
import json

path_data = [
    (["home", "user", "file.txt"], "content1"),
    (["home", "user", "file2.txt"], "content2"),
    (["etc", "config.ini"], "settings")
]

tree = {}
for path, value in path_data:
    current = tree
    for node in path[:-1]:  # Navigate to the parent of the leaf
        if node not in current:
            current[node] = {}
        current = current[node]
    current[path[-1]] = value  # Set the value at the leaf

print(json.dumps(tree, indent=2))
Output:
{
  "home": {
    "user": {
      "file.txt": "content1",
      "file2.txt": "content2"
    }
  },
  "etc": {
    "config.ini": "settings"
  }
}
Notice how path[:-1] navigates through all segments except the last, and then path[-1] is used to set the final key to the actual value instead of an empty dictionary.
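One situation the loop above does not cover is a path that tries to descend through an existing leaf value, which typically fails with a TypeError that is hard to interpret. The following sketch shows one possible way to report that conflict explicitly. The helper name set_leaf and the choice of ValueError are assumptions made for illustration, and the path is assumed to be non-empty.

```python
def set_leaf(tree, path, value):
    """Set value at the end of path, refusing to descend through a leaf."""
    current = tree
    for depth, node in enumerate(path[:-1]):
        child = current.setdefault(node, {})
        if not isinstance(child, dict):
            prefix = "/".join(path[:depth + 1])
            raise ValueError(f"{prefix!r} is already a leaf, cannot add children")
        current = child
    current[path[-1]] = value


tree = {}
set_leaf(tree, ["etc", "config.ini"], "settings")

try:
    # "config.ini" already holds a string, so it cannot act as a folder
    set_leaf(tree, ["etc", "config.ini", "backup"], "copy")
except ValueError as exc:
    print(exc)
```

Whether a conflict should raise, overwrite the leaf, or be skipped depends on the data. Raising is simply the most conservative option.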
Using defaultdict (Autovivification)
A defaultdict whose default factory returns another defaultdict of the same kind creates a self-expanding tree structure. Any access to a missing key automatically creates a new nested level:
import json
from collections import defaultdict

def make_tree():
    """Create a recursively auto-generating dictionary."""
    return defaultdict(make_tree)

tree = make_tree()
paths = [
    ["products", "electronics", "phones"],
    ["products", "electronics", "laptops"],
    ["products", "clothing", "shirts"]
]

for path in paths:
    current = tree
    for node in path:
        current = current[node]  # No need to check existence

# Convert to regular dict for display and serialization
def to_regular_dict(d):
    if isinstance(d, defaultdict):
        return {k: to_regular_dict(v) for k, v in d.items()}
    return d

print(json.dumps(to_regular_dict(tree), indent=2))
Output:
{
  "products": {
    "electronics": {
      "phones": {},
      "laptops": {}
    },
    "clothing": {
      "shirts": {}
    }
  }
}
The key advantage is that you never need to check whether a key exists before descending into it. The defaultdict creates intermediate nodes automatically.
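The defaultdict approach does not strictly require a conversion step before calling json.dumps(), because defaultdict is a subclass of dict and serializes like one. Converting back to regular dictionaries is still worthwhile: it gives a clean printed representation and stops later lookups of missing keys from silently creating new branches.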
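A side effect worth knowing about is that merely reading a missing key on such a tree also creates it. The sketch below reuses the make_tree and to_regular_dict helpers from the example above, with made-up keys, to show the difference between a membership test, a lookup, and a lookup on the converted tree.

```python
from collections import defaultdict


def make_tree():
    """Create a recursively auto-generating dictionary."""
    return defaultdict(make_tree)


def to_regular_dict(d):
    """Recursively copy a defaultdict tree into plain dicts."""
    if isinstance(d, defaultdict):
        return {k: to_regular_dict(v) for k, v in d.items()}
    return d


tree = make_tree()
tree["products"]["electronics"]

before = "toys" in tree["products"]   # Membership tests do not create keys
tree["products"]["toys"]              # A plain lookup does
after = "toys" in tree["products"]

print(before, after)  # False True

frozen = to_regular_dict(tree)
try:
    frozen["products"]["games"]       # Regular dicts raise instead of growing
except KeyError as exc:
    print("KeyError:", exc)
```

Using the in operator or .get() for read-only checks avoids growing the tree by accident.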
Compact Lambda Version
The same concept can be expressed as a one-liner using a lambda:
from collections import defaultdict

# Convert to regular dict for display and serialization
def to_regular_dict(d):
    if isinstance(d, defaultdict):
        return {k: to_regular_dict(v) for k, v in d.items()}
    return d

Tree = lambda: defaultdict(Tree)
tree = Tree()
tree["a"]["b"]["c"]["d"]  # All intermediate levels are auto-created
print(to_regular_dict(tree))
Output:
{'a': {'b': {'c': {'d': {}}}}}
Using reduce() (Functional Style)
For a compact, functional approach, reduce() from functools can walk through each path and build the tree using dict.setdefault():
import json
from functools import reduce

paths = [
    ["root", "branch", "leaf1"],
    ["root", "branch", "leaf2"],
    ["root", "other"]
]

tree = {}
for path in paths:
    reduce(lambda node, key: node.setdefault(key, {}), path, tree)

print(json.dumps(tree, indent=2))
Output:
{
  "root": {
    "branch": {
      "leaf1": {},
      "leaf2": {}
    },
    "other": {}
  }
}
How it works:
- reduce() takes the tree as the initial accumulator value
- For each key in the path, setdefault(key, {}) either returns the existing dictionary at that key or creates a new empty one
- The returned dictionary becomes the accumulator for the next key, effectively descending one level deeper
The setdefault() method is what makes this work so cleanly. It combines the "check if key exists" and "create if missing" steps into a single call.
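The same accumulator idea also works for reading from a tree. As a rough sketch, pairing reduce() with operator.getitem walks a path without creating anything along the way. The function name get_by_path is a hypothetical choice for this illustration.

```python
from functools import reduce
from operator import getitem


def get_by_path(tree, path):
    """Return the subtree at path, raising KeyError if any segment is missing."""
    return reduce(getitem, path, tree)


tree = {"root": {"branch": {"leaf1": {}, "leaf2": {}}, "other": {}}}

print(get_by_path(tree, ["root", "branch"]))  # {'leaf1': {}, 'leaf2': {}}
print(get_by_path(tree, []) is tree)          # True: an empty path returns the root
```

Because getitem is used instead of setdefault(), a missing segment raises KeyError rather than silently adding a branch.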
Parsing String Paths
Real-world paths often arrive as strings with a separator rather than pre-split lists. A small wrapper handles the splitting:
import json

def string_paths_to_tree(string_paths, separator="/"):
    """Convert string paths like '/a/b/c' to a tree dictionary."""
    tree = {}
    for path_str in string_paths:
        parts = [p for p in path_str.split(separator) if p]
        current = tree
        for part in parts:
            current = current.setdefault(part, {})
    return tree

paths = [
    "/home/user/docs",
    "/home/user/images",
    "/etc/nginx/conf.d",
    "/var/log"
]

tree = string_paths_to_tree(paths)
print(json.dumps(tree, indent=2))
Output:
{
  "home": {
    "user": {
      "docs": {},
      "images": {}
    }
  },
  "etc": {
    "nginx": {
      "conf.d": {}
    }
  },
  "var": {
    "log": {}
  }
}
The list comprehension [p for p in path_str.split(separator) if p] filters out the empty strings produced by leading, trailing, or doubled separators.
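The same splitting idea carries over to URL structures. The sketch below assumes the standard library's urllib.parse.urlsplit for parsing and groups URLs by host, then by path segment. The function name urls_to_tree and the example.com URLs are placeholders invented for illustration.

```python
from urllib.parse import urlsplit


def urls_to_tree(urls):
    """Group URLs into a tree keyed by host, then by path segment."""
    tree = {}
    for url in urls:
        parsed = urlsplit(url)
        # Host first, then every non-empty path segment
        parts = [parsed.netloc] + [p for p in parsed.path.split("/") if p]
        current = tree
        for part in parts:
            current = current.setdefault(part, {})
    return tree


# example.com is a placeholder domain used only for illustration
urls = [
    "https://example.com/docs/api/",
    "https://example.com/docs/guide",
    "https://example.com/blog",
]
print(urls_to_tree(urls))
# {'example.com': {'docs': {'api': {}, 'guide': {}}, 'blog': {}}}
```

Query strings and fragments are ignored here because only the netloc and path components are used.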
Reusable Conversion Function
A general-purpose function that handles both plain paths and paths with associated values:
import json

def paths_to_tree(paths, value_func=None):
    """
    Convert a list of paths to a nested dictionary.
    Args:
        paths: List of path lists, or list of (path, value) tuples.
        value_func: Optional function to generate leaf values from the path.
    """
    tree = {}
    for item in paths:
        if isinstance(item, tuple):
            path, value = item
        else:
            path = item
            value = {} if value_func is None else value_func(path)
        if not path:
            continue  # An empty path has no leaf to set
        current = tree
        for node in path[:-1]:
            current = current.setdefault(node, {})
        if value == {}:
            # Keep an existing subtree instead of replacing it with an empty leaf
            current.setdefault(path[-1], {})
        else:
            current[path[-1]] = value
    return tree

# Basic usage with plain paths
paths = [["a", "b"], ["a", "c"]]
print(paths_to_tree(paths))

# With associated values
paths_with_values = [
    (["users", "alice"], {"id": 1}),
    (["users", "bob"], {"id": 2})
]
print(json.dumps(paths_to_tree(paths_with_values), indent=2))
Output:
{'a': {'b': {}, 'c': {}}}
{
  "users": {
    "alice": {
      "id": 1
    },
    "bob": {
      "id": 2
    }
  }
}
Reverse Operation: Tree Back to Paths
To convert a nested dictionary back into a flat list of paths, use recursion to walk through every branch:
def tree_to_paths(tree, current_path=None):
    """Convert a nested dictionary back to a list of paths."""
    if current_path is None:
        current_path = []
    paths = []
    for key, value in tree.items():
        new_path = current_path + [key]
        if isinstance(value, dict) and value:
            # Non-empty dict means more children to explore
            paths.extend(tree_to_paths(value, new_path))
        else:
            # Leaf node (empty dict, None, or a value)
            paths.append(new_path)
    return paths

tree = {
    "a": {
        "b": {"c": {}},
        "d": {}
    },
    "e": {}
}

paths = tree_to_paths(tree)
print(paths)
Output:
[['a', 'b', 'c'], ['a', 'd'], ['e']]
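A quick way to sanity-check the two directions is a round trip from paths to a tree and back. The sketch below uses a generator variant of the recursive walk. The helper names build and leaves are hypothetical and chosen only for this illustration.

```python
def build(paths):
    """Build a tree of empty-dict leaves from a list of paths."""
    tree = {}
    for path in paths:
        current = tree
        for node in path:
            current = current.setdefault(node, {})
    return tree


def leaves(tree, prefix=()):
    """Yield the path to every leaf (empty dict or non-dict value)."""
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            yield from leaves(value, path)
        else:
            yield list(path)


original = [["a", "b", "c"], ["a", "d"], ["e"]]
print(list(leaves(build(original))) == original)  # True

# A path that is a prefix of another one is absorbed into the longer path
print(list(leaves(build([["a"], ["a", "b"]]))))  # [['a', 'b']]
```

The second call shows that the round trip is lossy when one input path is a prefix of another, since only leaves are reported on the way back.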
Practical Example: File System Representation
Building a tree from an actual directory structure on disk:
import os
import json

def directory_to_tree(root_path):
    """Build a tree dictionary from an actual directory structure."""
    tree = {}
    for dirpath, dirnames, filenames in os.walk(root_path):
        rel_path = os.path.relpath(dirpath, root_path)
        parts = rel_path.split(os.sep) if rel_path != "." else []
        # Navigate to the current position in the tree
        current = tree
        for part in parts:
            current = current.setdefault(part, {})
        # Add subdirectories
        for dirname in dirnames:
            current.setdefault(dirname, {})
        # Add files (None indicates a file rather than a directory)
        for filename in filenames:
            current[filename] = None
    return tree

# Usage:
# tree = directory_to_tree("/path/to/directory")
# print(json.dumps(tree, indent=2))
Files are represented as None values to distinguish them from directories, which are represented as dictionaries.
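With that convention in place, the tree can be rendered as an indented listing. The following is a minimal sketch rather than a full tree printer: the function name render_tree, the two-space indent, and the trailing slash for directories are all assumptions made for illustration, and the sample data is hand-written rather than read from disk.

```python
def render_tree(tree, depth=0):
    """Return a list of indented lines, one per node."""
    lines = []
    for name in sorted(tree):
        child = tree[name]
        is_dir = isinstance(child, dict)
        # Directories get a trailing slash, files do not
        lines.append("  " * depth + name + ("/" if is_dir else ""))
        if is_dir:
            lines.extend(render_tree(child, depth + 1))
    return lines


# Hand-written sample in the same shape directory_to_tree returns
sample = {
    "src": {"main.py": None, "utils": {"helpers.py": None}},
    "README.md": None,
}
print("\n".join(render_tree(sample)))
```

Sorting the keys keeps the listing stable regardless of the order in which os.walk() visited the entries.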
Method Comparison
| Method | Style | Pros | Best For |
|---|---|---|---|
| Iterative pointer | Procedural | Clear, debuggable, easy to extend | Production code |
| defaultdict | Recursive | Clean auto-creation, no existence checks | Quick prototyping |
| reduce() | Functional | Compact, elegant | Short scripts, one-liners |
Conclusion
Converting a list of paths to a tree dictionary in Python is a versatile technique for representing hierarchical data. The iterative pointer approach, optionally shortened with setdefault(), is recommended for production code: it is explicit, easy to debug, simple to extend with leaf values, metadata, or validation, and does not require understanding recursive defaultdict behavior or functional programming concepts. The defaultdict approach eliminates existence checks and is great for quick prototyping. The reduce() approach is the most compact and is best reserved for situations where brevity is prioritized over clarity.