How to Get a Subset of a Dictionary in Python
Extracting a subset of a dictionary, that is, selecting only the key-value pairs that meet certain criteria, is a common operation in Python. Whether you're filtering configuration settings, preparing data for an API call, or isolating relevant fields from a large dataset, it helps to know how to extract dictionary subsets efficiently.
This guide covers multiple approaches, from the most Pythonic to more specialized alternatives, along with techniques for filtering by both keys and values.
Using Dictionary Comprehension (Recommended)
Dictionary comprehension is the most Pythonic and readable way to extract a subset. It creates a new dictionary by iterating over selected keys:
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
keys_to_include = {'a', 'c', 'e'}
subset = {k: original[k] for k in keys_to_include}
print(subset)
Output (key order may vary between runs because keys_to_include is a set):
{'a': 1, 'e': 5, 'c': 3}
This reads naturally: "build a dictionary of k: original[k] for each k in the keys I want."
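If the order of the resulting keys matters, one option is to iterate over a list or tuple instead of a set, since dictionaries keep insertion order in Python 3.7 and later. A minimal sketch, assuming the same original dictionary (the name ordered_keys is only illustrative):

```python
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
ordered_keys = ['e', 'a', 'c']  # illustrative name; a list fixes the iteration order

# The subset's keys appear in the same order as ordered_keys
subset = {k: original[k] for k in ordered_keys}
print(subset)  # {'e': 5, 'a': 1, 'c': 3}
```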
If any key in keys_to_include doesn't exist in the original dictionary, you'll get a KeyError:
original = {'a': 1, 'b': 2}
keys_to_include = {'a', 'c'} # 'c' doesn't exist
subset = {k: original[k] for k in keys_to_include}
# KeyError: 'c'
Fix: use if k in original to skip missing keys.
original = {'a': 1, 'b': 2}
keys_to_include = {'a', 'c'} # 'c' doesn't exist
subset = {k: original[k] for k in keys_to_include if k in original}
print(subset)
# Output: {'a': 1}
Or use .get() to provide a default value for missing keys:
original = {'a': 1, 'b': 2}
keys_to_include = {'a', 'c'} # 'c' doesn't exist
subset = {k: original.get(k, None) for k in keys_to_include}
print(subset)
# Output: {'a': 1, 'c': None}
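Another way to skip missing keys is to intersect the dictionary's key view with the wanted keys, since the object returned by dict.keys() supports set operations. A small sketch reusing the variable names from the examples above (common_keys is an illustrative name):

```python
original = {'a': 1, 'b': 2}
keys_to_include = {'a', 'c'}  # 'c' doesn't exist

# original.keys() is a set-like view, so & keeps only keys present in both
common_keys = original.keys() & keys_to_include
subset = {k: original[k] for k in common_keys}
print(subset)  # {'a': 1}
```

This does the same job as the if k in original guard, so the choice between them is mostly a matter of taste.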
Filtering by Values
Sometimes you need to filter a dictionary based on values rather than keys. Dictionary comprehension handles this elegantly:
Values Above a Threshold
scores = {'Alice': 85, 'Bob': 42, 'Charlie': 91, 'Diana': 67, 'Eve': 73}
# Keep only entries where score is 70 or above
passing = {name: score for name, score in scores.items() if score >= 70}
print(passing)
Output:
{'Alice': 85, 'Charlie': 91, 'Eve': 73}
Excluding Specific Values
data = {'a': 1, 'b': None, 'c': 3, 'd': None, 'e': 5}
# Remove entries with None values
cleaned = {k: v for k, v in data.items() if v is not None}
print(cleaned)
Output:
{'a': 1, 'c': 3, 'e': 5}
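Key and value conditions can also be combined in a single comprehension. The sketch below reuses the scores dictionary from earlier; the team set is a hypothetical addition for illustration:

```python
scores = {'Alice': 85, 'Bob': 42, 'Charlie': 91, 'Diana': 67, 'Eve': 73}
team = {'Alice', 'Bob', 'Eve'}  # hypothetical set of names to consider

# Keep entries that satisfy both the key condition and the value condition
team_passing = {
    name: score
    for name, score in scores.items()
    if name in team and score >= 70
}
print(team_passing)  # {'Alice': 85, 'Eve': 73}
```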
Excluding Specific Keys
Instead of specifying which keys to include, sometimes it's easier to specify which keys to exclude:
user_data = {
    'name': 'Alice',
    'email': 'alice@example.com',
    'password': 'secret123',
    'age': 30
}
keys_to_exclude = {'password'}
safe_data = {k: v for k, v in user_data.items() if k not in keys_to_exclude}
print(safe_data)
Output:
{'name': 'Alice', 'email': 'alice@example.com', 'age': 30}
Using filter() with lambda
The filter() function provides a functional programming approach. It filters the dictionary's .items() based on a lambda condition:
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
keys_to_include = {'a', 'c'}
subset = dict(filter(lambda item: item[0] in keys_to_include, original.items()))
print(subset)
Output:
{'a': 1, 'c': 3}
The lambda receives each (key, value) tuple from .items() and returns True only for items whose key is in keys_to_include.
While filter() works well, dictionary comprehension is generally preferred in Python for its readability. Use filter() when you're chaining functional operations or when the filtering logic is defined as a separate function.
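As a sketch of the "separate function" case, the lambda can be replaced with a named predicate. The function name is_wanted is an assumption made for this example:

```python
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
keys_to_include = {'a', 'c'}

def is_wanted(item):
    """Return True when the item's key is one of the keys to keep."""
    key, _value = item  # each item is a (key, value) tuple from .items()
    return key in keys_to_include

subset = dict(filter(is_wanted, original.items()))
print(subset)  # {'a': 1, 'c': 3}
```

A named predicate can be reused and tested independently, which is harder to do with an inline lambda.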
Using operator.itemgetter()
The itemgetter() function from the operator module retrieves multiple values from a dictionary at once. It's efficient when you need to extract values for a known set of keys:
from operator import itemgetter
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
keys_to_include = ['a', 'c']
values = itemgetter(*keys_to_include)(original)
# itemgetter returns a single value (not a tuple) when given one key
if len(keys_to_include) == 1:
    values = (values,)
subset = dict(zip(keys_to_include, values))
print(subset)
Output:
{'a': 1, 'c': 3}
itemgetter() raises a KeyError if any requested key is missing from the dictionary. Always ensure the keys exist before using this method.
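One way to make sure the keys exist is to check for missing ones first and report them all at once. The helper below, pick_strict, is a hypothetical name used only for this sketch. It also handles an empty key list, because calling itemgetter() with no arguments raises a TypeError:

```python
from operator import itemgetter

def pick_strict(d, keys):
    """Hypothetical helper: return the subset, or raise KeyError listing every missing key."""
    keys = list(keys)
    missing = [k for k in keys if k not in d]
    if missing:
        raise KeyError(f"missing keys: {missing}")
    if not keys:
        return {}  # itemgetter() needs at least one key
    values = itemgetter(*keys)(d)
    if len(keys) == 1:
        values = (values,)  # a single key yields a bare value, not a tuple
    return dict(zip(keys, values))

original = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
print(pick_strict(original, ['a', 'c']))  # {'a': 1, 'c': 3}
```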
Using map() with dict.get()
Combine map() with dict.get() to extract values for specified keys. The .get() method gracefully handles missing keys by returning None (or a custom default):
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
keys_to_include = ['a', 'c', 'x'] # 'x' doesn't exist
subset = dict(zip(keys_to_include, map(original.get, keys_to_include)))
print(subset)
Output:
{'a': 1, 'c': 3, 'x': None}
Unlike itemgetter(), this approach doesn't raise errors for missing keys; it inserts None instead.
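Passing original.get directly to map() always yields None for missing keys. To use a custom default, one option is a small lambda that supplies the second argument to .get(). In this sketch the default of 0 is an arbitrary choice:

```python
original = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
keys_to_include = ['a', 'c', 'x']  # 'x' doesn't exist
default = 0  # arbitrary default chosen for illustration

# The lambda forwards each key to .get() together with the custom default
values = map(lambda k: original.get(k, default), keys_to_include)
subset = dict(zip(keys_to_include, values))
print(subset)  # {'a': 1, 'c': 3, 'x': 0}
```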
Practical Example: Extracting API Response Fields
A real-world use case is extracting specific fields from an API response:
api_response = {
    'id': 12345,
    'name': 'Alice Johnson',
    'email': 'alice@example.com',
    'phone': '555-0100',
    'address': '123 Main St',
    'created_at': '2024-01-15',
    'internal_notes': 'VIP customer',
    'password_hash': 'abc123...'
}
# Extract only the fields safe to display
public_fields = {'id', 'name', 'email', 'phone', 'address'}
public_data = {k: api_response[k] for k in public_fields if k in api_response}
print("Public data:")
for key, value in public_data.items():
    print(f" {key}: {value}")
Output (field order may vary between runs because public_fields is a set):
Public data:
address: 123 Main St
phone: 555-0100
id: 12345
email: alice@example.com
name: Alice Johnson
Sensitive fields like password_hash and internal_notes are excluded from the result.
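The same pattern extends to a list of records. The sketch below wraps the comprehension in a function and applies it to each record; the sample records and the name to_public are hypothetical:

```python
# Hypothetical sample records; real responses would come from the API
responses = [
    {'id': 1, 'name': 'Alice', 'email': 'alice@example.com', 'password_hash': 'abc123...'},
    {'id': 2, 'name': 'Bob', 'internal_notes': 'trial account'},
]
public_fields = ('id', 'name', 'email')  # a tuple keeps the field order stable

def to_public(record):
    """Return only the public fields that are present in the record."""
    return {k: record[k] for k in public_fields if k in record}

public_records = [to_public(r) for r in responses]
print(public_records)
# [{'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}, {'id': 2, 'name': 'Bob'}]
```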
Comparison of Approaches
| Method | Handles Missing Keys | Readability | Best For |
|---|---|---|---|
| Dict comprehension | ✅ (with if k in) | ⭐⭐⭐⭐⭐ | Most use cases (recommended) |
| Dict comprehension (values) | ✅ | ⭐⭐⭐⭐⭐ | Filtering by value conditions |
| filter() + lambda | ✅ | ⭐⭐⭐ | Functional programming style |
| operator.itemgetter() | ❌ (raises KeyError) | ⭐⭐⭐ | Fast extraction of known keys |
| map() + dict.get() | ✅ (returns None) | ⭐⭐⭐ | Safe extraction with defaults |
Conclusion
For most situations, dictionary comprehension is the best approach. It's readable, flexible, and handles both key-based and value-based filtering elegantly.
- Add an if k in original guard to safely handle missing keys.
- For functional programming patterns, filter() with lambda is a solid alternative.
- Use itemgetter() when performance matters and you're certain all keys exist.
Whichever method you choose, extracting dictionary subsets is a fundamental skill that makes your Python code cleaner and your data handling more precise.