How to Convert a List of Dicts to a Dict of Lists in Python

Converting from "records format" (a list of dictionaries) to "columnar format" (a dictionary of lists) is a common data transformation in Python. This restructuring is useful for plotting, database operations, analytical processing, and preparing data for libraries that expect column-oriented input.

For example, given a list of records like [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}], the goal is to produce {'name': ['Alice', 'Bob'], 'age': [25, 30]}.

In this guide, you will learn multiple approaches to perform this transformation, from Pandas one-liners to pure Python solutions, along with strategies for handling missing keys and preserving key order.

Using Pandas (Recommended for Data Analysis)

Pandas handles this transformation natively and automatically manages missing values across records:

import pandas as pd

records = [
    {'name': 'Alice', 'age': 25, 'city': 'NYC'},
    {'name': 'Bob', 'age': 30, 'city': 'LA'},
    {'name': 'Charlie', 'age': 35, 'city': 'Chicago'}
]

columnar = pd.DataFrame(records).to_dict(orient='list')

print(columnar)

Output:

{'name': ['Alice', 'Bob', 'Charlie'], 'age': [25, 30, 35], 'city': ['NYC', 'LA', 'Chicago']}

Handling Missing Keys with Pandas

When dictionaries have inconsistent keys, Pandas fills missing values with NaN automatically:

import pandas as pd

records = [
    {'id': 1, 'value': 100},
    {'id': 2},
    {'id': 3, 'value': 300, 'extra': 'data'}
]

columnar = pd.DataFrame(records).to_dict(orient='list')
print(columnar)

Output:

{'id': [1, 2, 3], 'value': [100.0, nan, 300.0], 'extra': [nan, nan, 'data']}

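This automatic NaN filling ensures that all lists have equal length, which is important for downstream processing. Note that the value column comes back as floats (100.0, 300.0): NaN is a float, so Pandas upcasts an integer column that contains missing entries.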
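If NaN is not what downstream code expects (for example, when None is the preferred marker for a missing value), the converted dictionary can be post-processed in plain Python. The following is a minimal sketch, not a Pandas feature: nan_to_none is a hypothetical helper name, and the input is a hand-written dictionary that mirrors the output shown above.

```python
import math

def nan_to_none(columns):
    """Return a copy of a dict of lists with float NaN entries replaced by None."""
    return {
        key: [None if isinstance(v, float) and math.isnan(v) else v for v in values]
        for key, values in columns.items()
    }

# Hand-written stand-in for the Pandas output shown above
nan = float('nan')
columnar = {'id': [1, 2, 3], 'value': [100.0, nan, 300.0], 'extra': [nan, nan, 'data']}

cleaned = nan_to_none(columnar)
print(cleaned)
# {'id': [1, 2, 3], 'value': [100.0, None, 300.0], 'extra': [None, None, 'data']}
```

The isinstance check comes first so that math.isnan is never called on strings or other non-numeric values.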

Using `defaultdict` (Pure Python)

For lightweight scripts or environments where Pandas is not available, defaultdict from the collections module provides a clean, dependency-free solution:

from collections import defaultdict

records = [
    {'time': 1, 'temp': 20, 'humidity': 45},
    {'time': 2, 'temp': 22, 'humidity': 50},
    {'time': 3, 'temp': 21, 'humidity': 48}
]

columns = defaultdict(list)

for row in records:
    for key, value in row.items():
        columns[key].append(value)

result = dict(columns)
print(result)

Output:

{'time': [1, 2, 3], 'temp': [20, 22, 21], 'humidity': [45, 50, 48]}

Reusable Function

Wrapping this logic in a function keeps your code DRY when performing this transformation in multiple places:

from collections import defaultdict

def records_to_columns(records):
    """Convert a list of dicts to a dict of lists."""
    columns = defaultdict(list)
    for row in records:
        for key, value in row.items():
            columns[key].append(value)
    return dict(columns)

data = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
print(records_to_columns(data))

Output:

{'a': [1, 3], 'b': [2, 4]}

Unequal List Lengths with Missing Keys

When dictionaries have different sets of keys, the basic defaultdict approach produces lists of unequal lengths because it only appends values for keys that exist in each record:

from collections import defaultdict

records = [
    {'id': 1, 'name': 'Alice'},
    {'id': 2, 'name': 'Bob', 'score': 95},
]

columns = defaultdict(list)
for row in records:
    for key, value in row.items():
        columns[key].append(value)

print(dict(columns))
# {'id': [1, 2], 'name': ['Alice', 'Bob'], 'score': [95]}
# Note: 'score' has only 1 element while others have 2

This can cause index alignment issues in downstream code. See the next section for a safer approach.
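One way to catch this problem early is to compare the list lengths before passing the result on. The sketch below is illustrative only; column_lengths and is_aligned are hypothetical helper names, not part of any library.

```python
def column_lengths(columns):
    """Map each key to the length of its list."""
    return {key: len(values) for key, values in columns.items()}

def is_aligned(columns):
    """Return True when every list has the same length (or there are no columns)."""
    return len(set(column_lengths(columns).values())) <= 1

columns = {'id': [1, 2], 'name': ['Alice', 'Bob'], 'score': [95]}

print(column_lengths(columns))  # {'id': 2, 'name': 2, 'score': 1}
print(is_aligned(columns))      # False
```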

Handling Missing Keys with Equal-Length Lists

When dictionaries have inconsistent keys, you should fill missing values explicitly to ensure all lists have the same length:

def records_to_columns_safe(records, fill_value=None):
    """Convert with handling for missing keys, ensuring equal-length lists."""
    if not records:
        return {}

    # Collect all possible keys across all records
    all_keys = set()
    for row in records:
        all_keys.update(row.keys())

    # Build lists using .get() with a fill value for missing keys
    return {
        key: [row.get(key, fill_value) for row in records]
        for key in all_keys
    }

records = [
    {'id': 1, 'name': 'Alice'},
    {'id': 2, 'name': 'Bob', 'score': 95},
    {'id': 3, 'score': 88}
]

result = records_to_columns_safe(records)
print(result)

Output (key order may differ between runs, because all_keys is a set and sets are unordered):

{'score': [None, 95, 88], 'id': [1, 2, 3], 'name': ['Alice', 'Bob', None]}

You can also specify a custom fill value:

# Reuses records_to_columns_safe() and records from the example above
result = records_to_columns_safe(records, fill_value=0)
print(result['score'])

Output:

[0, 95, 88]

Using Dictionary Comprehension

For records with consistent keys, a concise dictionary comprehension handles the conversion in a single expression:

records = [
    {'x': 1, 'y': 10},
    {'x': 2, 'y': 20},
    {'x': 3, 'y': 30}
]

# Assumes all dicts have the same keys
keys = records[0].keys()
columnar = {key: [row[key] for row in records] for key in keys}

print(columnar)

Output:

{'x': [1, 2, 3], 'y': [10, 20, 30]}

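Note: This approach raises a KeyError if any record is missing a key that exists in the first record, silently drops keys that appear only in later records, and raises an IndexError on an empty list (because of records[0]). Use it only when you are certain the list is non-empty and all dictionaries share the same set of keys.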
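When that guarantee is uncertain, the assumption can be checked up front. The following is a small sketch; has_uniform_keys is a hypothetical helper name introduced here for illustration.

```python
def has_uniform_keys(records):
    """Return True when every record has exactly the same set of keys."""
    if not records:
        return True  # Nothing to compare
    expected = records[0].keys()
    # Key views compare like sets, so key order inside each dict does not matter
    return all(row.keys() == expected for row in records)

uniform = [{'x': 1, 'y': 10}, {'x': 2, 'y': 20}]
ragged = [{'x': 1, 'y': 10}, {'x': 2}]

print(has_uniform_keys(uniform))  # True
print(has_uniform_keys(ragged))   # False
```

If the check fails, a fill-based approach such as records_to_columns_safe() is the safer choice.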

Preserving Key Order

When key order matters, such as when the output will be written to a CSV or displayed in a table, you can establish the order from the first record and append any additional keys found in subsequent records:

def records_to_columns_ordered(records):
    """Convert while preserving key order from the first record."""
    if not records:
        return {}

    # Establish key order from the first record
    ordered_keys = list(records[0].keys())

    # Add any additional keys from remaining records
    seen = set(ordered_keys)
    for row in records[1:]:
        for key in row.keys():
            if key not in seen:
                ordered_keys.append(key)
                seen.add(key)

    return {
        key: [row.get(key) for row in records]
        for key in ordered_keys
    }

records = [
    {'name': 'Alice', 'age': 25},
    {'name': 'Bob', 'age': 30, 'city': 'NYC'}
]

print(records_to_columns_ordered(records))

Output:

{'name': ['Alice', 'Bob'], 'age': [25, 30], 'city': [None, 'NYC']}

The keys appear in the order name, age, city, matching the order they were first encountered.
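The same first-seen ordering can be collected more compactly with dict.fromkeys, which relies on dictionaries preserving insertion order (guaranteed since Python 3.7). This is an alternative sketch rather than a replacement for the function above; ordered_union_of_keys is a hypothetical name.

```python
def ordered_union_of_keys(records):
    """Return all keys in first-seen order, without duplicates."""
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return list(dict.fromkeys(key for row in records for key in row))

records = [
    {'name': 'Alice', 'age': 25},
    {'name': 'Bob', 'age': 30, 'city': 'NYC'}
]

keys = ordered_union_of_keys(records)
columnar = {key: [row.get(key) for row in records] for key in keys}

print(keys)      # ['name', 'age', 'city']
print(columnar)  # {'name': ['Alice', 'Bob'], 'age': [25, 30], 'city': [None, 'NYC']}
```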

Reverse Operation: Columns Back to Records

To convert a dictionary of lists back to a list of dictionaries, iterate over the indices and build a dictionary for each position:

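def columns_to_records(columns):
    """Convert a dict of lists back to a list of dicts."""
    if not columns:
        return []  # No columns means no records

    keys = list(columns.keys())
    # Assumes every list has the same length as the first one
    length = len(columns[keys[0]])

    return [
        {key: columns[key][i] for key in keys}
        for i in range(length)
    ]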

columnar = {'name': ['Alice', 'Bob'], 'age': [25, 30]}
records = columns_to_records(columnar)
print(records)

Output:

[{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}]
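The reverse conversion can also be written with zip(), which walks all lists in parallel instead of indexing into them. This is a sketch of an alternative, with columns_to_records_zip as a hypothetical name. One behavioral difference to keep in mind: zip() stops at the shortest list, so unequal-length input is truncated silently rather than raising an error.

```python
def columns_to_records_zip(columns):
    """Convert a dict of lists to a list of dicts using zip()."""
    keys = list(columns)
    # zip(*values) yields one tuple per row; it stops at the shortest list
    return [dict(zip(keys, row)) for row in zip(*columns.values())]

columnar = {'name': ['Alice', 'Bob'], 'age': [25, 30]}
records = columns_to_records_zip(columnar)

print(records)
# [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}]
print(columns_to_records_zip({}))  # []
```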

Performance Comparison

import timeit
from collections import defaultdict
import pandas as pd

records = [{'a': i, 'b': i * 2, 'c': i * 3} for i in range(10_000)]

def with_pandas():
    return pd.DataFrame(records).to_dict(orient='list')

def with_defaultdict():
    columns = defaultdict(list)
    for row in records:
        for k, v in row.items():
            columns[k].append(v)
    return dict(columns)

def with_comprehension():
    keys = records[0].keys()
    return {k: [r[k] for r in records] for k in keys}

print(f"Pandas:         {timeit.timeit(with_pandas, number=100):.4f}s")
print(f"defaultdict:    {timeit.timeit(with_defaultdict, number=100):.4f}s")
print(f"Comprehension:  {timeit.timeit(with_comprehension, number=100):.4f}s")

Typical output (10,000 records, 100 iterations):

Pandas:         2.5150s
defaultdict:    0.3665s
Comprehension:  0.1171s

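Note: In this run, the defaultdict version was roughly 7 times faster than Pandas and the dictionary comprehension roughly 21 times faster, because the pure Python approaches avoid the overhead of creating a full DataFrame. Absolute timings will vary with hardware, Python version, and Pandas version. Pandas becomes worthwhile when you need additional data processing after the conversion.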
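Because a single timeit measurement can be skewed by other activity on the machine, a common way to get a steadier number is to repeat the measurement and keep the fastest round. The sketch below uses timeit.repeat from the standard library on the comprehension variant only; the record count and round sizes are arbitrary choices made to keep the run short, not values from the benchmark above.

```python
import timeit

# Smaller data set than the benchmark above, chosen only to keep this sketch quick
records = [{'a': i, 'b': i * 2, 'c': i * 3} for i in range(1_000)]

def with_comprehension():
    keys = records[0].keys()
    return {k: [r[k] for r in records] for k in keys}

# Run 5 independent timing rounds of 20 calls each and keep the fastest round
timings = timeit.repeat(with_comprehension, number=20, repeat=5)
best = min(timings)

print(f"Comprehension (best of 5): {best:.4f}s")
```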

Practical Example: Preparing Data for Plotting

A common real-world use case is converting sensor readings from record format into columnar format for charting:

from collections import defaultdict

# Sensor readings stored as individual records
readings = [
    {'timestamp': '10:00', 'temp': 20, 'humidity': 45},
    {'timestamp': '10:15', 'temp': 22, 'humidity': 48},
    {'timestamp': '10:30', 'temp': 24, 'humidity': 52},
    {'timestamp': '10:45', 'temp': 23, 'humidity': 50},
]

# Convert to columnar format for plotting
columns = defaultdict(list)
for row in readings:
    for k, v in row.items():
        columns[k].append(v)

# Columnar format maps directly to plot axes
print(f"Timestamps:  {columns['timestamp']}")
print(f"Temperature: {columns['temp']}")
print(f"Humidity:    {columns['humidity']}")

# Ready for matplotlib:
# plt.plot(columns['timestamp'], columns['temp'])

Output:

Timestamps:  ['10:00', '10:15', '10:30', '10:45']
Temperature: [20, 22, 24, 23]
Humidity:    [45, 48, 52, 50]

Method Comparison

Method	Dependencies	Missing Keys	Performance	Best For
Pandas	`pandas`	Auto-fills with `NaN`	Slowest (DataFrame overhead)	Data analysis workflows
`defaultdict`	None	Manual handling needed	Fast	General-purpose scripts
Dict comprehension	None	Requires consistent keys	Fastest	Simple, uniform data

Conclusion

Converting a list of dictionaries to a dictionary of lists is a straightforward transformation with several approaches available in Python. Use Pandas when you are already in a data analysis workflow or need automatic handling of missing values with NaN. Use defaultdict for general-purpose Python scripts that need to avoid external dependencies. Use dictionary comprehension for the fastest and most concise solution when all records are guaranteed to share the same keys.

Best Practice

When records may have inconsistent keys, always use an approach that fills missing values (such as records_to_columns_safe() or Pandas) to ensure all output lists have equal length. Unequal-length lists are a common source of subtle bugs in downstream code that assumes columnar alignment.
