How to Group a List of Tuples into a Dictionary by Key in Python

Converting a list of key-value tuples into a dictionary where each key maps to a list of associated values is a common data transformation task.

Using defaultdict (Recommended)

The defaultdict class from the collections module gives a concise solution that groups everything in a single pass:

from collections import defaultdict

data = [(1, 'A'), (1, 'B'), (2, 'C'), (2, 'D'), (3, 'E')]

grouped = defaultdict(list)
for key, value in data:
    grouped[key].append(value)

print(dict(grouped))
# {1: ['A', 'B'], 2: ['C', 'D'], 3: ['E']}

Note: defaultdict(list) automatically creates an empty list for any new key, eliminating the need for existence checks.
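One consequence of this automatic creation is worth seeing in isolation: reading a missing key with square brackets also inserts it. The sketch below illustrates that side effect; the key names are made up for the demonstration.

```python
from collections import defaultdict

grouped = defaultdict(list)
grouped["a"].append(1)

# Reading a missing key with [] calls the factory and stores the result.
_ = grouped["missing"]
print(dict(grouped))
# {'a': [1], 'missing': []}

# .get() and `in` do not trigger the default factory, so no key is added.
print(grouped.get("other"))
# None
print("other" in grouped)
# False
```

If accidental insertion is a concern, converting the result with dict(grouped) once grouping is finished gives back ordinary KeyError behavior for missing keys.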

Using setdefault (No Imports)

When avoiding imports, setdefault provides similar functionality:

data = [(1, 'A'), (1, 'B'), (2, 'C'), (2, 'D'), (3, 'E')]

grouped = {}
for key, value in data:
    grouped.setdefault(key, []).append(value)

print(grouped)
# {1: ['A', 'B'], 2: ['C', 'D'], 3: ['E']}

The setdefault method returns the existing value for a key; if the key does not exist, it stores the default under that key and returns it.
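To make the return-value behavior concrete, here is a small sketch that calls setdefault twice on the same key (the key "x" and the variable names are illustrative only):

```python
grouped = {}

# Key absent: the default list is stored under "x" and returned.
first = grouped.setdefault("x", [])
first.append(1)

# Key present: the existing list is returned; the new [] is discarded.
second = grouped.setdefault("x", [])
second.append(2)

print(grouped)
# {'x': [1, 2]}
print(first is second)
# True
```

Note that the default argument (here a fresh empty list) is built on every call, even when the key already exists and the default goes unused.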

Grouping with Sets (Unique Values)

To collect only unique values per key, use sets instead of lists:

from collections import defaultdict

data = [(1, 'A'), (1, 'A'), (1, 'B'), (2, 'C')]

grouped = defaultdict(set)
for key, value in data:
    grouped[key].add(value)

print(dict(grouped))
# {1: {'A', 'B'}, 2: {'C'}}  (element order inside each set may vary)
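Because sets are unordered, the printed order of elements within each group can differ between runs. If a stable result is needed, one possible approach (a sketch, assuming the values are sortable) is to convert each set to a sorted list after grouping:

```python
from collections import defaultdict

data = [(1, 'A'), (1, 'A'), (1, 'B'), (2, 'C')]

grouped = defaultdict(set)
for key, value in data:
    grouped[key].add(value)

# Convert each set to a sorted list for deterministic output.
as_sorted_lists = {key: sorted(values) for key, values in grouped.items()}

print(as_sorted_lists)
# {1: ['A', 'B'], 2: ['C']}
```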

Grouping Objects by Attribute

Apply the same pattern to group objects:

from collections import defaultdict

class Product:
    def __init__(self, name, category):
        self.name = name
        self.category = category

products = [
    Product("Apple", "Fruit"),
    Product("Banana", "Fruit"),
    Product("Carrot", "Vegetable"),
]

by_category = defaultdict(list)
for product in products:
    by_category[product.category].append(product.name)

print(dict(by_category))
# {'Fruit': ['Apple', 'Banana'], 'Vegetable': ['Carrot']}
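If the same pattern is needed in several places, it can be wrapped in a small helper. The function name group_by below is hypothetical (it is not part of the standard library); it takes any callable as the key, such as operator.attrgetter.

```python
from collections import defaultdict
from operator import attrgetter

def group_by(items, key):
    """Group items into a dict of lists, using key(item) as the dict key."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)

class Product:
    def __init__(self, name, category):
        self.name = name
        self.category = category

products = [
    Product("Apple", "Fruit"),
    Product("Banana", "Fruit"),
    Product("Carrot", "Vegetable"),
]

# Group the Product objects themselves, then extract names for display.
by_category = group_by(products, key=attrgetter("category"))
names = {cat: [p.name for p in items] for cat, items in by_category.items()}

print(names)
# {'Fruit': ['Apple', 'Banana'], 'Vegetable': ['Carrot']}
```

Storing the whole object rather than only its name keeps the other attributes available after grouping.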

Using itertools.groupby

For pre-sorted data, groupby offers a functional approach:

from itertools import groupby

data = [(1, 'A'), (1, 'B'), (2, 'C'), (2, 'D')]

# Data MUST be sorted by key first
data_sorted = sorted(data, key=lambda x: x[0])

grouped = {
    key: [v for _, v in group]
    for key, group in groupby(data_sorted, key=lambda x: x[0])
}

print(grouped)
# {1: ['A', 'B'], 2: ['C', 'D']}

Warning: groupby only groups consecutive elements with the same key. Always sort your data first, or you'll get unexpected results:

from itertools import groupby

unsorted = [(1, 'A'), (2, 'B'), (1, 'C')]  # Not sorted

# Wrong result: does not combine all 1s
result = {k: list(g) for k, g in groupby(unsorted, key=lambda x: x[0])}
print(result)
# {1: [(1, 'C')], 2: [(2, 'B')]}  # Missing (1, 'A')!
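For comparison, here is a sketch of the same unsorted input handled correctly by sorting on the grouping key before calling groupby. Using operator.itemgetter for both the sort and the grouping is a stylistic choice that keeps the two key functions identical.

```python
from itertools import groupby
from operator import itemgetter

unsorted = [(1, 'A'), (2, 'B'), (1, 'C')]

by_key = itemgetter(0)  # same key function for sorting and grouping

result = {
    k: [v for _, v in g]
    for k, g in groupby(sorted(unsorted, key=by_key), key=by_key)
}

print(result)
# {1: ['A', 'C'], 2: ['B']}
```

Because sorted is stable, values that share a key keep their original relative order.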

Avoid the O(N²) Trap

A nested comprehension approach looks elegant but performs terribly:

data = [(1, 'A'), (1, 'B'), (2, 'C')]

# ❌ O(N²): rescans the entire list for every tuple
grouped = {
    k: [v for k2, v in data if k2 == k]
    for k, _ in data
}
print(grouped)
# {1: ['A', 'B'], 2: ['C']}

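The inner list comprehension rescans the whole list once for every tuple in the outer loop, even for keys it has already processed, so the running time grows quadratically with the input size.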

# ✅ O(N): single pass through the data
from collections import defaultdict

data = [(1, 'A'), (1, 'B'), (2, 'C')]

grouped = defaultdict(list)
for k, v in data:
    grouped[k].append(v)

print(dict(grouped))
# {1: ['A', 'B'], 2: ['C']}
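The difference can be measured with a rough timeit sketch. The function names, the input size of 2000 tuples, and the 50 distinct keys are arbitrary choices for illustration. Actual timings depend on the machine, so no expected output is shown.

```python
import timeit
from collections import defaultdict

def group_nested(data):
    # Quadratic: the inner comprehension rescans data for every tuple.
    return {k: [v for k2, v in data if k2 == k] for k, _ in data}

def group_single_pass(data):
    # Linear: each tuple is visited exactly once.
    grouped = defaultdict(list)
    for k, v in data:
        grouped[k].append(v)
    return dict(grouped)

data = [(i % 50, i) for i in range(2000)]

# Both approaches produce the same grouping.
assert group_nested(data) == group_single_pass(data)

slow = timeit.timeit(lambda: group_nested(data), number=1)
fast = timeit.timeit(lambda: group_single_pass(data), number=1)
print(f"nested: {slow:.4f}s, single pass: {fast:.4f}s")
```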

Grouping Multiple Values

When tuples have more than two elements:

from collections import defaultdict

# (category, product, price)
data = [
    ('Fruit', 'Apple', 1.50),
    ('Fruit', 'Banana', 0.75),
    ('Vegetable', 'Carrot', 0.50),
]

# Group by category, keeping product-price pairs
grouped = defaultdict(list)
for category, product, price in data:
    grouped[category].append({'product': product, 'price': price})

print(dict(grouped))
# {'Fruit': [{'product': 'Apple', 'price': 1.5}, 
#            {'product': 'Banana', 'price': 0.75}],
#  'Vegetable': [{'product': 'Carrot', 'price': 0.5}]}
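When the number of trailing fields varies or naming each one is unnecessary, star-unpacking is one alternative. This is a sketch of that variation: it stores the remaining fields of each tuple as a tuple instead of a dictionary.

```python
from collections import defaultdict

# (category, product, price)
data = [
    ('Fruit', 'Apple', 1.50),
    ('Fruit', 'Banana', 0.75),
    ('Vegetable', 'Carrot', 0.50),
]

grouped = defaultdict(list)
for category, *rest in data:
    # rest is a list of every field after the first; store it as a tuple.
    grouped[category].append(tuple(rest))

print(dict(grouped))
# {'Fruit': [('Apple', 1.5), ('Banana', 0.75)], 'Vegetable': [('Carrot', 0.5)]}
```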

Aggregating Instead of Collecting

Sometimes you want to aggregate values rather than collect them:

from collections import defaultdict

sales = [('Jan', 100), ('Jan', 150), ('Feb', 200), ('Feb', 50)]

# Sum values by key
totals = defaultdict(int)
for month, amount in sales:
    totals[month] += amount

print(dict(totals))
# {'Jan': 250, 'Feb': 250}
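The same idea extends to other aggregates. As a sketch, the example below counts entries per key with collections.Counter and tracks the largest value per key with a plain dictionary. The variable names counts and largest are illustrative.

```python
from collections import Counter

sales = [('Jan', 100), ('Jan', 150), ('Feb', 200), ('Feb', 50)]

# Count how many entries each key has.
counts = Counter(month for month, _ in sales)
print(dict(counts))
# {'Jan': 2, 'Feb': 2}

# Track the largest value seen per key.
largest = {}
for month, amount in sales:
    largest[month] = max(amount, largest.get(month, amount))
print(largest)
# {'Jan': 150, 'Feb': 200}
```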

Method Comparison

Method	Time Complexity	Imports	Best For
`defaultdict(list)`	O(N)	Yes	General use
`setdefault`	O(N)	No	Import-free code
`groupby`	O(N log N)*	Yes	Pre-sorted data
Nested comprehension	O(N²)	No	Never use

*Includes sorting time

Summary

Use defaultdict(list) as your default approach for grouping; it is efficient, readable, and handles missing keys automatically.
Use setdefault when you want to avoid imports.
Avoid nested comprehensions for grouping as they create O(N²) performance problems.
