How to Filter Values in a Heterogeneous Dictionary in Python

When working with Python dictionaries, values aren't always of the same type. You might have a dictionary containing a mix of integers, strings, floats, and other types. Filtering such a heterogeneous dictionary - for example, keeping only integer values above a certain threshold while preserving all non-integer values - requires type-aware logic.

This is a common scenario in data cleaning, configuration parsing, and API response processing. In this guide, you'll learn the most effective methods to filter dictionary values based on type-specific conditions.

Understanding the Problem

Consider a dictionary where values are a mix of integers and strings:

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}

The goal is to filter out integer values that are less than or equal to a threshold K, while keeping all non-integer values untouched. With K = 3, the expected result is:

{'TutorialReference': 4, 'for': 'you'}
  • 'TutorialReference': 4 → integer, 4 > 3 → kept
  • 'is': 2 → integer, 2 <= 3 → removed
  • 'best': 3 → integer, 3 <= 3 → removed
  • 'for': 'you' → string, not an integer → kept

Using isinstance() with Dictionary Comprehension

The most Pythonic and recommended approach uses isinstance() for type checking within a dictionary comprehension.

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

result = {key: val for key, val in data.items()
          if not isinstance(val, int) or val > K}

print(result)

Output:

{'TutorialReference': 4, 'for': 'you'}

How it works:

  1. The comprehension iterates over each key-value pair in the dictionary.
  2. For each value, it checks: is this value not an integer, OR is it greater than K?
  3. If either condition is True, the pair is included in the result.

The logic effectively says: "Keep this entry unless it's an integer that fails the threshold check."
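To make the three steps above concrete, here is a minimal sketch that spells out the same condition one entry at a time. The helper name keep_entry is hypothetical and exists only for illustration; it is not part of the original approach.

```python
data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

# Hypothetical helper: the comprehension's condition, written out step by step.
def keep_entry(val, threshold):
    is_not_int = not isinstance(val, int)
    # `or` short-circuits: the comparison only runs when val is an integer,
    # so a string is never compared with a number.
    return is_not_int or val > threshold

# Record the keep/remove decision for every key.
decisions = {key: keep_entry(val, K) for key, val in data.items()}
print(decisions)
# {'TutorialReference': True, 'is': False, 'best': False, 'for': True}
```

The short-circuit behavior of `or` is what makes the condition safe: non-integers are accepted before the numeric comparison is ever evaluated.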

Why isinstance() over type()?

isinstance() is preferred because it respects inheritance: a value whose type is a subclass of int still counts as an int. One consequence is that isinstance(True, int) returns True (bool is a subclass of int), so booleans are filtered as if they were integers, which you may or may not want. isinstance() also accepts a tuple of types, so several types can be checked at once: isinstance(val, (int, float)).
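The bool behavior is easy to overlook, so here is a small sketch of what it means in practice. The sample dictionary and the variable names subclass_aware and bool_exempt are assumptions made up for this illustration.

```python
data = {'count': 4, 'flag': True, 'label': 'x'}
K = 3

# isinstance() treats True as an int (True == 1), so it fails 1 > 3 and is dropped.
subclass_aware = {k: v for k, v in data.items()
                  if not isinstance(v, int) or v > K}

# One possible variation: exempt bool explicitly while still using isinstance().
bool_exempt = {k: v for k, v in data.items()
               if isinstance(v, bool) or not isinstance(v, int) or v > K}

print(subclass_aware)  # {'count': 4, 'label': 'x'}
print(bool_exempt)     # {'count': 4, 'flag': True, 'label': 'x'}
```

Whether booleans should take part in the threshold check depends on your data; the second form is just one way to keep them out of it.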

Using type() with Dictionary Comprehension

If you need a strict type check that excludes subclasses (like bool), use type() instead:

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you', 'flag': True}
K = 3

result = {key: val for key, val in data.items()
          if type(val) != int or val > K}

print(result)

Output:

{'TutorialReference': 4, 'for': 'you', 'flag': True}

Notice that True (a bool) is kept because type(True) != int evaluates to True - bool and int are different types when checked with type().

isinstance() vs. type() - Key Difference

value = True

print(isinstance(value, int))  # True (bool is a subclass of int)
print(type(value) == int)      # False (strict type comparison)

Output:

True
False

Choose based on whether you want subclass-aware checking (isinstance) or strict type matching (type).
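The same distinction applies to any int subclass, not only bool. The sketch below uses a hypothetical IntEnum called Priority (invented for this example) to show how the two checks diverge when filtering.

```python
from enum import IntEnum

class Priority(IntEnum):  # hypothetical int subclass, for illustration only
    LOW = 1
    HIGH = 5

data = {'level': Priority.LOW, 'count': 2, 'note': 'n/a'}
K = 3

# Subclass-aware: Priority.LOW counts as an int, and 1 > 3 is False, so it is removed.
by_isinstance = {k: v for k, v in data.items()
                 if not isinstance(v, int) or v > K}

# Strict: type(Priority.LOW) is Priority, not int, so the threshold is skipped.
by_type = {k: v for k, v in data.items()
           if type(v) != int or v > K}

print(list(by_isinstance))  # ['note']
print(list(by_type))        # ['level', 'note']
```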

Using filter() with a Lambda Function

For a functional programming approach, use filter() with a lambda:

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

result = dict(filter(
    lambda item: not isinstance(item[1], int) or item[1] > K,
    data.items()
))

print(result)

Output:

{'TutorialReference': 4, 'for': 'you'}

How it works:

  1. data.items() produces key-value tuples.
  2. The lambda checks each tuple's value (item[1]) against the filtering condition.
  3. dict() converts the filtered result back into a dictionary.

While this works, the dictionary comprehension approach is generally more readable for this use case.
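If you prefer filter() but find the lambda hard to read, one option is a named predicate. This is a sketch of that variation; the function name passes_threshold is an assumption, not something from the original example.

```python
data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

def passes_threshold(item):
    """Return True if the (key, value) pair should be kept."""
    _key, val = item  # unpack the tuple instead of indexing with item[1]
    return not isinstance(val, int) or val > K

result = dict(filter(passes_threshold, data.items()))
print(result)  # {'TutorialReference': 4, 'for': 'you'}
```

Naming the predicate also makes it possible to test the condition on its own.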

Using a Standard for Loop

When your filtering logic is more complex or you need to perform additional operations (like logging which entries were removed), a for loop provides the clearest structure:

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

result = {}
for key, val in data.items():
    # Keep non-integers, and integers strictly greater than K
    if not isinstance(val, int) or val > K:
        result[key] = val

print(result)

Output:

{'TutorialReference': 4, 'for': 'you'}

This is functionally identical to the dictionary comprehension approach but makes it easier to add debugging or extra processing steps.
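As an example of the extra processing a loop allows, the sketch below also records which entries were dropped. The removed dictionary is an addition for illustration and is not part of the original example.

```python
data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

result = {}
removed = {}  # illustrative addition: entries that failed the threshold
for key, val in data.items():
    if not isinstance(val, int) or val > K:
        result[key] = val
    else:
        removed[key] = val

print(result)   # {'TutorialReference': 4, 'for': 'you'}
print(removed)  # {'is': 2, 'best': 3}
```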

Handling Multiple Numeric Types

In real-world data, you might encounter int and float values mixed together (and sometimes Decimal as well). To apply the threshold to more than one numeric type, pass a tuple of types to isinstance(). The example below covers int and float:

data = {'price': 9.99, 'name': 'widget', 'stock': 2, 'rating': 4.5, 'id': 'A001'}
K = 3

result = {key: val for key, val in data.items()
          if not isinstance(val, (int, float)) or val > K}

print(result)

Output:

{'price': 9.99, 'name': 'widget', 'rating': 4.5, 'id': 'A001'}

By passing a tuple (int, float) to isinstance(), both integer and floating-point values are checked against the threshold.
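The same pattern can be extended to Decimal by adding it to the tuple. The following is a minimal sketch with made-up sample data; it assumes an integer threshold, which compares cleanly with Decimal values.

```python
from decimal import Decimal

# Hypothetical sample data mixing Decimal, int, float, and str values
data = {
    'price': Decimal('9.99'),
    'name': 'widget',
    'stock': 2,
    'rating': 4.5,
    'tax': Decimal('0.07'),
}
K = 3

numeric_types = (int, float, Decimal)

result = {key: val for key, val in data.items()
          if not isinstance(val, numeric_types) or val > K}

print(result)
# {'price': Decimal('9.99'), 'name': 'widget', 'rating': 4.5}
```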

Creating a Reusable Function

For production code, a reusable function makes your filtering logic testable and self-documenting:

def filter_dict_by_threshold(
    data: dict,
    threshold: float,
    numeric_types: tuple = (int, float)
) -> dict:
    """Keep non-numeric values and numeric values above the threshold."""
    return {
        key: val for key, val in data.items()
        if not isinstance(val, numeric_types) or val > threshold
    }


# Usage examples
data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you', 'score': 5.5}

print(filter_dict_by_threshold(data, 3))
print(filter_dict_by_threshold(data, 5))

Output:

{'TutorialReference': 4, 'for': 'you', 'score': 5.5}
{'for': 'you', 'score': 5.5}

Customizing the condition

You can generalize the function to support any test (e.g., >=, <, ==) by passing a predicate as a parameter: a function that takes a value and returns True when the entry should be kept.

def filter_dict(data: dict, condition) -> dict:
    """Filter dictionary items based on a custom condition."""
    return {k: v for k, v in data.items() if condition(v)}

# Usage examples
data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you', 'score': 5.5}

# Keep integers > 3 and all non-integers
result = filter_dict(data, lambda v: not isinstance(v, int) or v > 3)

print(result)
# {'TutorialReference': 4, 'for': 'you', 'score': 5.5}
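Another way to vary the comparison is to keep the type check inside the function and pass only the operator. The sketch below uses the standard operator module; the function name filter_dict_by_comparison and its parameters are assumptions introduced for this illustration.

```python
import operator

def filter_dict_by_comparison(data, threshold, compare=operator.gt,
                              numeric_types=(int, float)):
    """Keep non-numeric values and numeric values where compare(val, threshold) is true."""
    return {k: v for k, v in data.items()
            if not isinstance(v, numeric_types) or compare(v, threshold)}

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you', 'score': 5.5}

at_least_3 = filter_dict_by_comparison(data, 3, operator.ge)
below_3 = filter_dict_by_comparison(data, 3, operator.lt)

print(at_least_3)
# {'TutorialReference': 4, 'best': 3, 'for': 'you', 'score': 5.5}
print(below_3)
# {'is': 2, 'for': 'you'}
```

With this design the caller cannot accidentally skip the type check, at the cost of being less flexible than a free-form predicate.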

Common Mistake: Forgetting to Handle Non-Numeric Types

A frequent error is applying a numeric comparison to all values without checking the type first, which causes a TypeError.

Wrong approach:

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

result = {key: val for key, val in data.items() if val > K}

Output:

TypeError: '>' not supported between instances of 'str' and 'int'

Python can't compare the string 'you' with the integer 3, so the comprehension crashes.

Correct approach - check the type first:

data = {'TutorialReference': 4, 'is': 2, 'best': 3, 'for': 'you'}
K = 3

result = {key: val for key, val in data.items()
          if not isinstance(val, int) or val > K}

print(result)

Output:

{'TutorialReference': 4, 'for': 'you'}

Quick Comparison of Methods

Method                             | Readability  | Best For
isinstance() + dict comprehension  | ⭐⭐⭐ High  | Most use cases (recommended)
type() + dict comprehension        | ⭐⭐⭐ High  | Strict type matching (exclude subclasses)
filter() + lambda                  | ⭐⭐ Medium  | Functional programming style
for loop                           | ⭐⭐⭐ High  | Complex logic or debugging needs

All methods have O(n) time complexity and O(n) space complexity, where n is the number of items in the dictionary.

Conclusion

Filtering values in a heterogeneous dictionary requires type-aware conditions to avoid TypeError exceptions. Here are the key takeaways:

  • Dictionary comprehension with isinstance() is the most Pythonic and recommended approach for most scenarios.
  • Use type() instead of isinstance() only when you need strict type matching that excludes subclasses like bool.
  • filter() with lambda works well in functional programming patterns but is less readable for this use case.
  • Always check the value's type before applying numeric comparisons to prevent runtime errors in mixed-type dictionaries.

For reusable code, wrap your filtering logic in a function with configurable threshold and type parameters.