How to Conditionally Join Dictionary Lists in Python
A conditional join on two lists of dictionaries merges dictionaries from both lists when a specified key has the same value in both. This operation is similar to a SQL JOIN on a common column: you compare a "join key" across two datasets and combine matching records.
This technique is useful in data processing pipelines, API response merging, configuration management, and any scenario where you need to combine related records from separate data sources based on a shared field.
In this guide, you will learn multiple ways to perform conditional joins on dictionary lists in Python, from efficient lookup-based approaches to straightforward loop-based methods.
Understanding the Problem
Given two lists of dictionaries and a common key, merge each dictionary from the first list with the dictionary from the second list that has the same value for that key. Matching is by key value, not by position in the list.
list1 = [
    {"name": "Alice", "dept": "Engineering"},
    {"name": "Bob", "dept": "Sales"},
    {"name": "Carol", "dept": "Marketing"},
]
list2 = [
    {"bonus": 5000, "dept": "Engineering"},
    {"bonus": 3000, "dept": "HR"},
    {"bonus": 4000, "dept": "Marketing"},
]
join_key = "dept"
# Expected result:
# [
#     {"name": "Alice", "dept": "Engineering", "bonus": 5000},
#     {"name": "Bob", "dept": "Sales"},  # No match in list2
#     {"name": "Carol", "dept": "Marketing", "bonus": 4000},
# ]
Dictionaries from list1 are merged with dictionaries from list2 only when the value of join_key matches. Unmatched dictionaries from list1 are kept as-is (similar to a SQL LEFT JOIN).
Method 1: Using a Lookup Dictionary (Recommended)
The most efficient approach pre-builds a lookup dictionary from the second list, keyed by the join key's value. This reduces the join operation from O(n × m) to O(n + m) on average, provided the join-key values are hashable (strings, numbers, tuples).
def conditional_join(list1, list2, join_key):
    """Merge dictionaries from list1 and list2 where join_key values match."""
    # Build a lookup from list2 for O(1) average-case access.
    # If list2 repeats a join-key value, the last dictionary wins.
    lookup = {}
    for d in list2:
        key_value = d.get(join_key)
        if key_value is not None:
            lookup[key_value] = d

    # Merge matching dictionaries
    result = []
    for d in list1:
        key_value = d.get(join_key)
        if key_value in lookup:
            # Merge: start from the list1 dict; on conflicting keys,
            # the list2 dict's values overwrite list1's
            merged = {**d, **lookup[key_value]}
            result.append(merged)
        else:
            result.append(d.copy())
    return result

list1 = [
    {"tutorialreference": 1, "is": 3, "best": 2},
    {"tutorialreference": 1, "best": 6},
    {"all": 7, "best": 10},
]
list2 = [
    {"good": 4, "best": 2},
    {"tutorial": 2, "best": 3},
    {"CS": 2, "best": 10},
]
result = conditional_join(list1, list2, "best")
print(result)
Output:
[
{'tutorialreference': 1, 'is': 3, 'best': 2, 'good': 4},
{'tutorialreference': 1, 'best': 6},
{'all': 7, 'best': 10, 'CS': 2}
]
How it works:
- A lookup dictionary maps each join_key value from list2 to its dictionary, enabling O(1) lookups.
- For each dictionary in list1, check if its join_key value exists in the lookup.
- If a match is found, merge the two dictionaries using {**d, **lookup[key_value]}.
- If no match exists, include the original dictionary unchanged.
This approach has O(n + m) time complexity, making it the most efficient method, especially for large lists. It's analogous to a hash join in database systems.
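One detail worth checking before relying on a lookup: if list2 contains several dictionaries with the same join-key value, a lookup built by plain assignment keeps the last one, whereas the scanning approaches in Methods 2 and 3 stop at the first. The following is a minimal sketch, assuming first-match behavior is what you want. The helper name build_first_match_lookup is hypothetical and only used for illustration.

```python
def build_first_match_lookup(dicts, join_key):
    """Map each join-key value to the FIRST dictionary that carries it."""
    lookup = {}
    for d in dicts:
        if join_key in d:
            # setdefault stores the value only if the key is not present yet,
            # so later duplicates do not overwrite the first match.
            lookup.setdefault(d[join_key], d)
    return lookup


list2 = [
    {"bonus": 5000, "dept": "Engineering"},
    {"bonus": 9999, "dept": "Engineering"},  # duplicate join-key value
    {"bonus": 4000, "dept": "Marketing"},
]

lookup = build_first_match_lookup(list2, "dept")
print(lookup["Engineering"]["bonus"])  # 5000
```

Swapping this lookup into Method 1 should make its results agree with the first-match methods when list2 has duplicates.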
Method 2: Using next() and update()
This approach iterates through list1 and uses next() with a generator expression to find the first matching dictionary in list2. It modifies list1 in place.
list1 = [
    {"tutorialreference": 1, "is": 3, "best": 2},
    {"tutorialreference": 1, "best": 6},
    {"all": 7, "best": 10},
]
list2 = [
    {"good": 4, "best": 2},
    {"tutorial": 2, "best": 3},
    {"CS": 2, "best": 10},
]
join_key = "best"

for d1 in list1:
    # Find the first dict in list2 with the same join-key value;
    # dictionaries that lack the key never match
    match = next(
        (
            d2 for d2 in list2
            if join_key in d1 and join_key in d2
            and d1[join_key] == d2[join_key]
        ),
        None,
    )
    # Compare against None so that an empty dict still counts as a match
    if match is not None:
        d1.update(match)

print(list1)
Output:
[
{'tutorialreference': 1, 'is': 3, 'best': 2, 'good': 4},
{'tutorialreference': 1, 'best': 6},
{'all': 7, 'best': 10, 'CS': 2}
]
This method modifies list1 in place using update(). If you need to preserve the original data, work with copies:
import copy
list1_copy = copy.deepcopy(list1)
# ... perform joins on list1_copy
Also, the time complexity is O(n × m) because next() scans list2 for each dictionary in list1. For large lists, prefer the lookup dictionary approach.
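If in-place modification is not acceptable, the same next()-based search can write into a new list instead. This is a minimal sketch under the assumption that the dictionaries are flat (no nested mutable values), so a shallow copy is enough. The function name join_with_next is hypothetical.

```python
def join_with_next(list1, list2, join_key):
    """Return a new list of merged dicts; neither input list is modified."""
    result = []
    for d1 in list1:
        match = None
        if join_key in d1:
            # First dict in list2 whose join-key value equals d1's
            match = next(
                (
                    d2 for d2 in list2
                    if join_key in d2 and d2[join_key] == d1[join_key]
                ),
                None,
            )
        merged = dict(d1)  # shallow copy, so d1 itself stays untouched
        if match is not None:
            merged.update(match)
        result.append(merged)
    return result


list1 = [{"name": "Alice", "id": 1}, {"name": "Bob", "id": 2}]
list2 = [{"score": 95, "id": 1}]

joined = join_with_next(list1, list2, "id")
print(joined)  # [{'name': 'Alice', 'id': 1, 'score': 95}, {'name': 'Bob', 'id': 2}]
print(list1)   # [{'name': 'Alice', 'id': 1}, {'name': 'Bob', 'id': 2}]
```

For dictionaries that hold nested lists or dictionaries, copy.deepcopy would be the safer choice, at the cost of extra copying.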
Method 3: Using Nested Loops
The most explicit approach uses nested for loops with a for...else construct to handle unmatched dictionaries:
def conditional_join(list1, list2, join_key):
    """Join two dictionary lists using nested loops."""
    result = []
    for d1 in list1:
        for d2 in list2:
            if join_key in d1 and join_key in d2 and d1[join_key] == d2[join_key]:
                result.append({**d1, **d2})
                break
        else:
            # No matching dictionary found in list2
            result.append(d1.copy())
    return result

list1 = [
    {"tutorialreference": 1, "is": 3, "best": 2},
    {"tutorialreference": 1, "best": 6},
    {"all": 7, "best": 10},
]
list2 = [
    {"good": 4, "best": 2},
    {"tutorial": 2, "best": 3},
    {"CS": 2, "best": 10},
]
result = conditional_join(list1, list2, "best")
print(result)
Output:
[
{'tutorialreference': 1, 'is': 3, 'best': 2, 'good': 4},
{'tutorialreference': 1, 'best': 6},
{'all': 7, 'best': 10, 'CS': 2}
]
How it works:
- For each dictionary in list1, iterate through all dictionaries in list2.
- If a match is found on the join key, merge with {**d1, **d2} and break.
- The for...else construct ensures that if no match is found (the loop completes without breaking), the original dictionary is added to the result.
The for...else pattern in Python executes the else block only when the loop completes without hitting a break. This is a clean way to handle the "no match found" case.
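To see the for...else behavior in isolation, here is a small sketch that is independent of the join logic. The function first_match is a hypothetical name used only for this illustration.

```python
def first_match(target, values):
    """Report where target first appears in values, using for...else."""
    for index, value in enumerate(values):
        if value == target:
            outcome = f"found at index {index}"
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop ended without break,
        # which includes the case of an empty sequence
        outcome = "not found"
    return outcome


print(first_match(10, [2, 3, 10]))  # found at index 2
print(first_match(6, [2, 3, 10]))   # not found
print(first_match(6, []))           # not found
```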
Common Mistake: Losing Unmatched Dictionaries
A frequent error is forgetting to include dictionaries from list1 that have no match in list2:
Wrong: drops unmatched dictionaries
def join_wrong(list1, list2, join_key):
    result = []
    for d1 in list1:
        for d2 in list2:
            if d1.get(join_key) == d2.get(join_key):
                result.append({**d1, **d2})
    return result

list1 = [{"name": "Alice", "id": 1}, {"name": "Bob", "id": 2}]
list2 = [{"score": 95, "id": 1}]
print(join_wrong(list1, list2, "id"))
Output (Bob is missing!):
[{'name': 'Alice', 'id': 1, 'score': 95}]
Correct: preserves unmatched dictionaries (LEFT JOIN behavior)
def join_correct(list1, list2, join_key):
    # Skip list2 dictionaries that lack the join key, so a missing key
    # is never treated as a match
    lookup = {d[join_key]: d for d in list2 if join_key in d}
    result = []
    for d1 in list1:
        if join_key in d1 and d1[join_key] in lookup:
            result.append({**d1, **lookup[d1[join_key]]})
        else:
            # Keep the unmatched dictionary (LEFT JOIN behavior)
            result.append(d1.copy())
    return result

list1 = [{"name": "Alice", "id": 1}, {"name": "Bob", "id": 2}]
list2 = [{"score": 95, "id": 1}]
print(join_correct(list1, list2, "id"))
Output:
[{'name': 'Alice', 'id': 1, 'score': 95}, {'name': 'Bob', 'id': 2}]
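One way to catch this mistake early is to check two properties that a one-match-per-record LEFT JOIN should satisfy: the result has exactly one entry per list1 dictionary, and each entry still contains every key of its original. The sketch below assumes that definition. The names left_join and looks_like_left_join are hypothetical.

```python
def left_join(list1, list2, join_key):
    """Lookup-based LEFT JOIN; dictionaries lacking join_key never match."""
    lookup = {d[join_key]: d for d in list2 if join_key in d}
    result = []
    for d1 in list1:
        if join_key in d1 and d1[join_key] in lookup:
            result.append({**d1, **lookup[d1[join_key]]})
        else:
            result.append(dict(d1))
    return result


def looks_like_left_join(list1, result):
    """True if result has one entry per list1 dict and keeps all its keys."""
    if len(result) != len(list1):
        return False
    # dict key views behave like sets, so <= is a subset test
    return all(
        original.keys() <= joined.keys()
        for original, joined in zip(list1, result)
    )


list1 = [{"name": "Alice", "id": 1}, {"name": "Bob", "id": 2}]
list2 = [{"score": 95, "id": 1}]

joined = left_join(list1, list2, "id")
print(looks_like_left_join(list1, joined))      # True
print(looks_like_left_join(list1, joined[:1]))  # False: Bob was dropped
```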
Handling Key Conflicts
When both dictionaries contain the same key (other than the join key), you need to decide which value takes priority. With {**d1, **d2}, the second dictionary's values win:
d1 = {"name": "Alice", "score": 90, "dept": "Engineering"}
d2 = {"score": 95, "dept": "Engineering"}
merged = {**d1, **d2}
print(merged)
Output (d2's score of 95 overwrites d1's score of 90):
{'name': 'Alice', 'score': 95, 'dept': 'Engineering'}
To give list1 priority instead, reverse the order:
d1 = {"name": "Alice", "score": 90, "dept": "Engineering"}
d2 = {"score": 95, "dept": "Engineering"}
merged = {**d2, **d1}
print(merged)
Output (d1's score of 90 is preserved):
{'score': 90, 'dept': 'Engineering', 'name': 'Alice'}
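If neither value should be discarded, a third option is to keep both by renaming the conflicting key from the second dictionary. This is a minimal sketch, assuming string keys and a suffix that does not collide with an existing key. The function merge_keep_both and its suffix parameter are hypothetical and not part of the methods above.

```python
def merge_keep_both(d1, d2, join_key, suffix="_right"):
    """Merge d2 into a copy of d1, renaming d2's conflicting keys."""
    merged = dict(d1)
    for key, value in d2.items():
        if key != join_key and key in d1:
            # Conflict: keep d1's value and store d2's under a suffixed key
            merged[key + suffix] = value
        else:
            merged[key] = value
    return merged


d1 = {"name": "Alice", "score": 90, "dept": "Engineering"}
d2 = {"score": 95, "dept": "Engineering"}

print(merge_keep_both(d1, d2, "dept"))
# {'name': 'Alice', 'score': 90, 'dept': 'Engineering', 'score_right': 95}
```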
Performance Comparison
| Method | Time Complexity | Space Complexity | Preserves Unmatched | In-Place |
|---|---|---|---|---|
| Lookup dictionary | O(n + m) | O(m) for lookup, O(n) for result | ✅ Yes | ❌ No |
| next() + update() | O(n × m) | O(1) | ✅ Yes | ✅ Yes |
| Nested loops | O(n × m) | O(n) for result | ✅ Yes | ❌ No |
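To check these complexity claims on your own data, a rough timing comparison with the standard timeit module can help. The sketch below uses synthetic records and compact, hypothetical versions of the lookup and nested-loop joins (lookup_join and nested_join). Timings depend on the machine, so no figures are shown.

```python
import timeit


def lookup_join(list1, list2, join_key):
    """O(n + m) join using a dictionary lookup (first match wins)."""
    lookup = {}
    for d in list2:
        if join_key in d:
            lookup.setdefault(d[join_key], d)
    return [
        {**d1, **lookup[d1[join_key]]}
        if join_key in d1 and d1[join_key] in lookup
        else dict(d1)
        for d1 in list1
    ]


def nested_join(list1, list2, join_key):
    """O(n * m) join using nested loops and for...else."""
    result = []
    for d1 in list1:
        for d2 in list2:
            if join_key in d1 and join_key in d2 and d1[join_key] == d2[join_key]:
                result.append({**d1, **d2})
                break
        else:
            result.append(dict(d1))
    return result


# Synthetic data: every second record in list1 has a match in list2
n = 500
list1 = [{"id": i, "name": f"user{i}"} for i in range(n)]
list2 = [{"id": i, "score": i * 2} for i in range(0, n, 2)]

# Both implementations should agree before their speed is compared
same_result = lookup_join(list1, list2, "id") == nested_join(list1, list2, "id")
print("Same result:", same_result)

for fn in (lookup_join, nested_join):
    seconds = timeit.timeit(lambda: fn(list1, list2, "id"), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

Increasing n should widen the gap between the two, since the nested-loop version does work proportional to n × m.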
Summary
Conditional joining of dictionary lists is a powerful technique for merging related records from separate data sources. Key takeaways:
- Use a lookup dictionary (Method 1) for the best performance: its O(n + m) time complexity makes it ideal for large datasets.
- Use next() + update() for concise code when lists are small and in-place modification is acceptable.
- Use nested loops when you need maximum control and clarity over the join logic.
- Always handle unmatched dictionaries to implement proper LEFT JOIN behavior.
- Be mindful of key conflicts: decide which dictionary's values should take priority when merging.