How to Get Unique Values from a List of Dictionaries in Python

When working with lists of dictionaries, a common data structure for representing records, API responses, or database rows, you'll often need to extract all unique values across the dictionaries. This is useful for identifying distinct entries, building filter options, or deduplicating data.

For example, given:

data = [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}, {'name': 'Alice', 'age': 22}]

The unique values across all dictionaries are: {'Alice', 'Bob', 25, 30, 22}.

This guide covers several approaches, from extracting all unique values to targeting specific keys.

The most concise and efficient way to collect unique values is combining a set with a nested generator expression:

data = [
{'name': 'Alice', 'age': 25},
{'name': 'Bob', 'age': 30},
{'name': 'Alice', 'age': 22}
]

unique_values = set(val for d in data for val in d.values())
print(unique_values)

Output:

{'Alice', 'Bob', 22, 25, 30}

The generator expression iterates over each dictionary in data, then over each value in that dictionary. The set automatically discards duplicates.

note

Sets are unordered, so the output order is not guaranteed. If you need to preserve the order of first appearance, see the dict.fromkeys() approach below.
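One caveat with the set-based approach is that every value must be hashable. If a record holds a list or a nested dictionary, set() raises TypeError. The sketch below shows one possible workaround and is not part of the recipe above: a hypothetical helper, make_hashable, converts lists to tuples and dictionaries to frozensets of items before deduplicating. The records data and the tags key are made up for illustration.

```python
def make_hashable(value):
    """Hypothetical helper: return a hashable stand-in for common containers."""
    if isinstance(value, dict):
        return frozenset((k, make_hashable(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(make_hashable(v) for v in value)
    if isinstance(value, set):
        return frozenset(make_hashable(v) for v in value)
    return value  # strings, numbers, None, etc. are already hashable

records = [
    {'name': 'Alice', 'tags': ['admin', 'dev']},
    {'name': 'Bob', 'tags': ['dev']},
    {'name': 'Alice', 'tags': ['admin', 'dev']},
]

# set(val for d in records for val in d.values()) would raise TypeError here,
# because the 'tags' lists are unhashable.
unique_values = {make_hashable(val) for d in records for val in d.values()}
print(unique_values)
```

The trade-off is that the result contains the converted forms (tuples instead of lists), so this only fits cases where that substitution is acceptable.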

Getting Unique Values for a Specific Key

More commonly, you need unique values from a specific key rather than all values across all keys. For example, getting all unique names:

data = [
{'name': 'Alice', 'age': 25, 'city': 'NYC'},
{'name': 'Bob', 'age': 30, 'city': 'LA'},
{'name': 'Alice', 'age': 22, 'city': 'Chicago'},
{'name': 'Charlie', 'age': 30, 'city': 'NYC'}
]

# Unique names
unique_names = set(d['name'] for d in data)
print("Unique names:", unique_names)

# Unique cities
unique_cities = set(d['city'] for d in data)
print("Unique cities:", unique_cities)

# Unique ages
unique_ages = set(d['age'] for d in data)
print("Unique ages:", unique_ages)

Output:

Unique names: {'Charlie', 'Alice', 'Bob'}
Unique cities: {'Chicago', 'NYC', 'LA'}
Unique ages: {25, 30, 22}

Handling missing keys safely

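If some dictionaries might not contain the target key, indexing with d['age'] raises KeyError. Filter on key presence so those records are skipped (alternatively, call d.get('age') and discard the None results):

data = [
    {'name': 'Alice', 'age': 25},
    {'name': 'Bob'},  # No 'age' key
    {'name': 'Charlie', 'age': 30}
]

# Only records that actually contain 'age' contribute a value
unique_ages = set(d['age'] for d in data if 'age' in d)
print("Unique ages:", unique_ages)

Output:

Unique ages: {25, 30}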
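Handling missing keys can also be wrapped in a small reusable function. The sketch below is only one way to do it: the helper name unique_for_key and the _MISSING sentinel are hypothetical, introduced here so that a record that legitimately stores None is not confused with a record that lacks the key.

```python
_MISSING = object()  # sentinel: distinguishes "key absent" from "value is None"

def unique_for_key(records, key):
    """Hypothetical helper: unique values stored under `key`,
    skipping records that do not have the key at all."""
    values = (d.get(key, _MISSING) for d in records)
    return {v for v in values if v is not _MISSING}

data = [
    {'name': 'Alice', 'age': 25},
    {'name': 'Bob'},                 # no 'age' key: skipped
    {'name': 'Charlie', 'age': 30},
    {'name': 'Dana', 'age': None},   # explicit None: kept
]

unique_ages = unique_for_key(data, 'age')
print("Unique ages:", unique_ages)
```

If None should be treated as "no value" in your data, a plain d.get(key) followed by a None filter is simpler.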

Using dict.fromkeys() to Preserve Order

If you need unique values in the order they first appear, dict.fromkeys() is the best approach (Python 3.7+):

data = [
{'name': 'Alice', 'age': 25},
{'name': 'Bob', 'age': 30},
{'name': 'Alice', 'age': 22}
]

unique_values = list(dict.fromkeys(val for d in data for val in d.values()))
print(unique_values)

Output:

['Alice', 25, 'Bob', 30, 22]

Since dictionary keys are unique and maintain insertion order in Python 3.7+, this approach deduplicates while preserving the order of first occurrence.
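The same technique works for a single key. As a minimal sketch (the sample records here are illustrative), swapping the nested generator for a per-key one yields the unique names or cities in order of first appearance:

```python
data = [
    {'name': 'Alice', 'city': 'NYC'},
    {'name': 'Bob', 'city': 'LA'},
    {'name': 'Alice', 'city': 'Chicago'},
    {'name': 'Charlie', 'city': 'NYC'},
]

# dict.fromkeys() keeps the first occurrence of each key and drops repeats
ordered_names = list(dict.fromkeys(d['name'] for d in data))
ordered_cities = list(dict.fromkeys(d['city'] for d in data))

print(ordered_names)   # ['Alice', 'Bob', 'Charlie']
print(ordered_cities)  # ['NYC', 'LA', 'Chicago']
```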

Using itertools.chain.from_iterable()

For large datasets, itertools.chain.from_iterable() efficiently flattens the values from all dictionaries into a single iterator without creating intermediate lists:

from itertools import chain

data = [
{'name': 'Alice', 'age': 25},
{'name': 'Bob', 'age': 30},
{'name': 'Alice', 'age': 22}
]

unique_values = set(chain.from_iterable(d.values() for d in data))
print(unique_values)

Output:

{'Alice', 'Bob', 22, 25, 30}

note

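chain.from_iterable() lazily iterates through each dictionary's values without building an intermediate collection, which keeps the flattening step memory-efficient for very large datasets. The resulting set still holds every distinct value, so memory use grows with the number of unique values rather than with the total number of values.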
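Because the flattening is lazy, the records do not need to be in a list at all. The sketch below assumes a hypothetical generator, read_records, standing in for any source that yields dictionaries one at a time (for example, rows read from a file or a paginated API):

```python
from itertools import chain

def read_records():
    """Hypothetical record source: yields dictionaries one at a time."""
    yield {'name': 'Alice', 'age': 25}
    yield {'name': 'Bob', 'age': 30}
    yield {'name': 'Alice', 'age': 22}

# Each record is consumed and discarded as soon as its values are added,
# so only the set of unique values stays in memory.
unique_values = set(chain.from_iterable(d.values() for d in read_records()))
print(unique_values)
```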

Using a Loop for Custom Logic

A manual loop gives you the most flexibility to add custom filtering or transformation during the deduplication:

data = [
{'name': 'Alice', 'age': 25},
{'name': 'Bob', 'age': 30},
{'name': 'Alice', 'age': 22}
]

unique_values = []
seen = set()

for d in data:
    for val in d.values():
        if val not in seen:
            seen.add(val)
            unique_values.append(val)

print(unique_values)

Output:

['Alice', 25, 'Bob', 30, 22]

note

This approach preserves order and runs in O(n) time on average, where n is the total number of values, because the seen set makes each membership check constant-time.
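To illustrate the kind of custom logic a loop allows, here is a sketch under assumed rules that are not from the original example: None values are skipped, and strings are compared case-insensitively with surrounding whitespace ignored, while the first spelling encountered is the one kept.

```python
data = [
    {'name': 'Alice', 'city': 'NYC'},
    {'name': 'alice ', 'city': None},
    {'name': 'Bob', 'city': 'nyc'},
]

unique_values = []
seen = set()

for d in data:
    for val in d.values():
        if val is None:                # custom filter: ignore missing data
            continue
        # custom transformation: normalize strings before comparing
        key = val.strip().lower() if isinstance(val, str) else val
        if key not in seen:
            seen.add(key)              # track the normalized form
            unique_values.append(val)  # keep the original spelling

print(unique_values)  # ['Alice', 'NYC', 'Bob']
```

Separating the comparison key from the stored value is what makes this hard to express as a one-line set() or dict.fromkeys() call.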

Avoid checking membership against a list

Without the seen set, every val not in unique_values check has to scan the list, which costs O(n) per check and makes the whole loop O(n²):

# Slower approach: O(n²)
unique_values = []
for d in data:
    for val in d.values():
        if val not in unique_values:  # O(n) lookup on each iteration
            unique_values.append(val)

Always use a set for membership checks alongside the list to maintain O(n) performance.

Practical Example: Extracting Unique Values per Key

A real-world scenario is building a summary of all unique values for each key across a list of records:

data = [
{'product': 'Laptop', 'brand': 'Dell', 'category': 'Electronics'},
{'product': 'Phone', 'brand': 'Apple', 'category': 'Electronics'},
{'product': 'Shirt', 'brand': 'Nike', 'category': 'Clothing'},
{'product': 'Laptop', 'brand': 'HP', 'category': 'Electronics'},
{'product': 'Shoes', 'brand': 'Nike', 'category': 'Clothing'}
]

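# Get unique values for each key.
# Note: this assumes every record has the same keys as the first one;
# d[key] raises KeyError for a record that is missing a key.
summary = {}
for key in data[0].keys():
    summary[key] = list(dict.fromkeys(d[key] for d in data))

for key, values in summary.items():
    print(f"{key}: {values}")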

Output:

product: ['Laptop', 'Phone', 'Shirt', 'Shoes']
brand: ['Dell', 'Apple', 'Nike', 'HP']
category: ['Electronics', 'Clothing']
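Records do not always share the same keys. As a hedged variation on the summary above (the sample data and the color key are made up for illustration), iterating over each record's own items avoids relying on the keys of the first record:

```python
data = [
    {'product': 'Laptop', 'brand': 'Dell'},
    {'product': 'Phone', 'brand': 'Apple', 'color': 'Black'},
    {'product': 'Laptop', 'category': 'Electronics'},
]

summary = {}
for record in data:
    for key, value in record.items():
        # setdefault creates the per-key dict the first time a key is seen;
        # the inner dict acts as an insertion-ordered set of values.
        summary.setdefault(key, {})[value] = None

# Convert each inner dict to a plain list of its unique values
summary = {key: list(values) for key, values in summary.items()}

for key, values in summary.items():
    print(f"{key}: {values}")
```

Keys appear in the summary in the order they are first encountered, and a key that is missing from some records simply contributes fewer values.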

Comparison of Approaches

| Method | Preserves Order | Time Complexity | Memory Efficient | Best For |
| --- | --- | --- | --- | --- |
| set() + generator | No | O(n) | Yes | Fast deduplication (recommended) |
| dict.fromkeys() | Yes | O(n) | Yes | Order-preserving deduplication |
| itertools.chain + set | No | O(n) | Yes | Large datasets |
| Loop with seen set | Yes | O(n) | Yes | Custom logic during dedup |
| Loop (list-only check) | Yes | O(n²) | Yes | Small datasets, simplicity |

Conclusion

For extracting unique values from a list of dictionaries, a set with a generator expression is the fastest and most concise approach.

  • Use dict.fromkeys() when you need to preserve the order of first appearance.
  • For specific keys, target them directly in the generator expression.
  • For large datasets, itertools.chain.from_iterable() provides memory-efficient iteration.

Choose the method that best matches your needs for ordering, performance, and flexibility.