How to Convert a Matrix to a Dictionary in Python

Converting a matrix (list of lists) to a dictionary is a common data transformation, especially when pivoting tabular data so that each column becomes a dictionary key with its values collected into a list. This operation bridges the gap between row-oriented and column-oriented data representations.

In this guide, you will learn multiple approaches to perform this conversion, from pure Python techniques to Pandas and NumPy solutions, with practical examples for different matrix structures.

The most Pythonic approach transposes the matrix using zip(*matrix) and pairs the resulting columns with header names:

matrix = [
    [1, "Alice", 95],
    [2, "Bob", 87],
    [3, "Charlie", 92]
]

columns = ["ID", "Name", "Score"]

result = {col: list(vals) for col, vals in zip(columns, zip(*matrix))}

print(result)

Output:

{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}

How it works:

  1. zip(*matrix) unpacks the rows and transposes them into columns: (1, 2, 3), ('Alice', 'Bob', 'Charlie'), (95, 87, 92)
  2. zip(columns, ...) pairs each column name with its corresponding data
  3. The dictionary comprehension builds the final result, converting each tuple of values to a list

Understanding the Transpose Step

matrix = [
    [1, 2, 3],
    [4, 5, 6]
]

# zip(*matrix) is equivalent to zip([1,2,3], [4,5,6])
transposed = list(zip(*matrix))
print(transposed)

Output:

[(1, 4), (2, 5), (3, 6)]

Each tuple in the result contains all values from one column of the original matrix.
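
One caveat worth knowing: zip() stops at the shortest input, so a ragged matrix (rows of unequal length) silently loses values during the transpose. The following is a minimal sketch, assuming missing cells should be padded with None; the sample data and the fill value are illustrative choices, not part of the original examples. It uses itertools.zip_longest from the standard library:

```python
from itertools import zip_longest

# A ragged matrix: the second row is missing its last value
ragged = [
    [1, 2, 3],
    [4, 5],
]

# zip() stops at the shortest row, so the third column is dropped
truncated = list(zip(*ragged))

# zip_longest() pads the short row with the fill value instead
padded = list(zip_longest(*ragged, fillvalue=None))

print(truncated)  # [(1, 4), (2, 5)]
print(padded)     # [(1, 4), (2, 5), (3, None)]
```

Whether padding or rejecting ragged input is the right choice depends on the data; the point is only that plain zip() will not raise an error.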

Using Pandas (Data Analysis Standard)

Pandas handles this transformation natively and offers multiple output orientations:

import pandas as pd

matrix = [
    [1, "Alice", 95],
    [2, "Bob", 87],
    [3, "Charlie", 92]
]

columns = ["ID", "Name", "Score"]

result = pd.DataFrame(matrix, columns=columns).to_dict(orient='list')

print(result)

Output:

{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}

Different Output Orientations

Pandas provides several orientations to match different downstream needs:

import pandas as pd

matrix = [[1, "Alice"], [2, "Bob"]]
df = pd.DataFrame(matrix, columns=["ID", "Name"])

# Column-oriented: lists of values per column
print(df.to_dict(orient='list'))

# Index-oriented: row index maps to row dict
print(df.to_dict(orient='index'))

# Records: list of row dictionaries
print(df.to_dict(orient='records'))

Output:

{'ID': [1, 2], 'Name': ['Alice', 'Bob']}
{0: {'ID': 1, 'Name': 'Alice'}, 1: {'ID': 2, 'Name': 'Bob'}}
[{'ID': 1, 'Name': 'Alice'}, {'ID': 2, 'Name': 'Bob'}]
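
If Pandas is not available, the index-oriented shape can be approximated in plain Python. This is a rough sketch rather than a drop-in replacement: it assumes the default 0-based row index, and the by_id variation (keying rows by their first column) is an illustrative extra, not something Pandas produces here.

```python
matrix = [[1, "Alice"], [2, "Bob"]]
columns = ["ID", "Name"]

# Row position -> row dictionary, similar in shape to orient='index'
by_index = {i: dict(zip(columns, row)) for i, row in enumerate(matrix)}

# Variation: key each row by the value in its first column instead
by_id = {row[0]: dict(zip(columns, row)) for row in matrix}

print(by_index)  # {0: {'ID': 1, 'Name': 'Alice'}, 1: {'ID': 2, 'Name': 'Bob'}}
print(by_id)     # {1: {'ID': 1, 'Name': 'Alice'}, 2: {'ID': 2, 'Name': 'Bob'}}
```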

Using defaultdict for Incremental Building

When data arrives row by row, or you would rather not collect the whole matrix into a list before transposing it, defaultdict lets you build the dictionary incrementally:

from collections import defaultdict

matrix = [
    [1, "Alice", 95],
    [2, "Bob", 87],
    [3, "Charlie", 92]
]

headers = ["ID", "Name", "Score"]

data = defaultdict(list)

for row in matrix:
    for header, value in zip(headers, row):
        data[header].append(value)

print(dict(data))

Output:

{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}

Processing Streaming Data

This approach is particularly useful when rows arrive one at a time from a data stream, file reader, or generator:

from collections import defaultdict

headers = ["X", "Y", "Z"]
data = defaultdict(list)

def process_row(row):
    for h, v in zip(headers, row):
        data[h].append(v)

# Simulating rows arriving over time
process_row([1, 2, 3])
process_row([4, 5, 6])
process_row([7, 8, 9])

print(dict(data))

Output:

{'X': [1, 4, 7], 'Y': [2, 5, 8], 'Z': [3, 6, 9]}
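
The same pattern applies when the rows come from a file reader. Below is a minimal sketch using csv.reader from the standard library; the in-memory io.StringIO object stands in for a real file, and the int() conversion assumes every cell is an integer, which will not hold for arbitrary CSV data.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a real file; io.StringIO behaves like an open text file
fake_file = io.StringIO("X,Y,Z\n1,2,3\n4,5,6\n")

reader = csv.reader(fake_file)
headers = next(reader)          # the first line holds the column names
data = defaultdict(list)

for row in reader:              # rows are read one at a time
    for h, v in zip(headers, row):
        data[h].append(int(v))  # csv yields strings; convert as needed

print(dict(data))  # {'X': [1, 4], 'Y': [2, 5], 'Z': [3, 6]}
```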

Converting Rows to Dictionary Records

Instead of a columnar dictionary, you may need each row as an independent dictionary. This is the "records" format commonly used by APIs and databases:

matrix = [
    [1, "Alice", 95],
    [2, "Bob", 87],
    [3, "Charlie", 92]
]

columns = ["id", "name", "score"]

records = [dict(zip(columns, row)) for row in matrix]

print(records)

Output:

[{'id': 1, 'name': 'Alice', 'score': 95}, {'id': 2, 'name': 'Bob', 'score': 87}, {'id': 3, 'name': 'Charlie', 'score': 92}]
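
Because each record is a plain dictionary, the list serializes directly to JSON, which is one reason this format is convenient for APIs. A short sketch with the standard json module, using a shortened two-row version of the matrix:

```python
import json

matrix = [[1, "Alice", 95], [2, "Bob", 87]]
columns = ["id", "name", "score"]

records = [dict(zip(columns, row)) for row in matrix]

# Serialize the records to a JSON string, then parse it back
payload = json.dumps(records)
restored = json.loads(payload)

print(payload)
print(restored == records)  # True
```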

Handling a Matrix with a Header Row

When the first row of the matrix contains column names, separate it from the data before converting:

matrix = [
    ["ID", "Name", "Score"],  # Header row
    [1, "Alice", 95],
    [2, "Bob", 87],
    [3, "Charlie", 92]
]

headers = matrix[0]
data = matrix[1:]

# Columnar dictionary
columnar = {col: list(vals) for col, vals in zip(headers, zip(*data))}
print(columnar)

# Or as a list of records
records = [dict(zip(headers, row)) for row in data]
print(records)

Output:

{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}
[{'ID': 1, 'Name': 'Alice', 'Score': 95}, {'ID': 2, 'Name': 'Bob', 'Score': 87}, {'ID': 3, 'Name': 'Charlie', 'Score': 92}]
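
Both conversions rely on zip(), which quietly drops values when a row is shorter or longer than the header. A possible pre-check, sketched here with a deliberately malformed row (the sample data and the bad_rows name are illustrative), is to compare each row's length against the header before converting:

```python
matrix = [
    ["ID", "Name", "Score"],
    [1, "Alice", 95],
    [2, "Bob"],            # deliberately one value short
]

headers, *data = matrix

# Collect (position in matrix, row) pairs whose length differs from the header's
bad_rows = [
    (i, row)
    for i, row in enumerate(data, start=1)
    if len(row) != len(headers)
]

print(bad_rows)  # [(2, [2, 'Bob'])]
```

What to do with the offending rows (skip, pad, or raise an error) is left to the caller.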

Creating a 2D Coordinate Dictionary

For grid-based data, mapping each cell to its (row, col) coordinate provides efficient position-based lookups:

matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

coord_dict = {
    (row, col): value
    for row, values in enumerate(matrix)
    for col, value in enumerate(values)
}

print(coord_dict)
print(f"Value at (1, 1): {coord_dict[(1, 1)]}")
print(f"Value at (2, 0): {coord_dict[(2, 0)]}")

Output:

{(0, 0): 1, (0, 1): 2, (0, 2): 3, (1, 0): 4, (1, 1): 5, (1, 2): 6, (2, 0): 7, (2, 1): 8, (2, 2): 9}
Value at (1, 1): 5
Value at (2, 0): 7
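
One place a coordinate dictionary helps is neighbor lookup, because a membership test replaces explicit bounds checks. The neighbors helper below is a hypothetical illustration, not part of any library:

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

coord_dict = {
    (r, c): value
    for r, row in enumerate(matrix)
    for c, value in enumerate(row)
}

def neighbors(r, c):
    """Return the values of the up/down/left/right neighbors that exist."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [
        coord_dict[(r + dr, c + dc)]
        for dr, dc in offsets
        if (r + dr, c + dc) in coord_dict  # off-grid coordinates are simply absent
    ]

print(neighbors(0, 0))  # [4, 2]
print(neighbors(1, 1))  # [2, 8, 4, 6]
```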

Sparse Matrix Representation

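For matrices where most values are zero, storing only the non-zero entries can save significant memory. Each dictionary entry carries its own overhead, so the savings only appear when the matrix is mostly zeros: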

matrix = [
    [0, 0, 3],
    [0, 5, 0],
    [7, 0, 0]
]

sparse = {
    (row, col): value
    for row, values in enumerate(matrix)
    for col, value in enumerate(values)
    if value != 0
}

print(sparse)
print(f"Non-zero entries: {len(sparse)} out of {len(matrix) * len(matrix[0])}")

Output:

{(0, 2): 3, (1, 1): 5, (2, 0): 7}
Non-zero entries: 3 out of 9
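
Reading from the sparse form needs a default for the missing cells. A small sketch using dict.get with a default of 0; note that the dictionary does not record the matrix shape, so the n_rows and n_cols values here are supplied by hand:

```python
sparse = {(0, 2): 3, (1, 1): 5, (2, 0): 7}
n_rows, n_cols = 3, 3  # the shape is not stored in the dict, so it is given here

# A missing key means the cell is zero
print(sparse.get((0, 0), 0))  # 0
print(sparse.get((1, 1), 0))  # 5

# Rebuild the full (dense) matrix from the sparse entries
dense = [
    [sparse.get((r, c), 0) for c in range(n_cols)]
    for r in range(n_rows)
]

print(dense)  # [[0, 0, 3], [0, 5, 0], [7, 0, 0]]
```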

Using NumPy for Numerical Data

For large numerical matrices, NumPy provides efficient column slicing:

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

columns = ["A", "B", "C"]

result = {col: matrix[:, i].tolist() for i, col in enumerate(columns)}

print(result)

Output:

{'A': [1, 4, 7], 'B': [2, 5, 8], 'C': [3, 6, 9]}

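NumPy column slicing (matrix[:, i]) extracts a column without a Python-level loop, which is typically much faster than list comprehensions on large arrays; .tolist() then converts the values back to plain Python numbers. Keep in mind that a NumPy array holds a single data type, so a matrix mixing numbers and strings would be coerced to a common type. This method suits all-numeric data.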

Reverse Operation: Dictionary to Matrix

To convert a columnar dictionary back to a matrix:

columns_dict = {
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [95, 87, 92]
}

column_order = ['ID', 'Name', 'Score']

matrix = [list(row) for row in zip(*[columns_dict[col] for col in column_order])]

print(matrix)

Output:

[[1, 'Alice', 95], [2, 'Bob', 87], [3, 'Charlie', 92]]

Specifying column_order explicitly ensures the columns appear in the expected sequence.
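
When the dictionary's own key order is acceptable, the column order can be taken from the keys instead, since dictionaries preserve insertion order in Python 3.7 and later. A sketch that also puts the header row back on top, assuming all columns have the same length:

```python
columns_dict = {
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [95, 87, 92],
}

# Keys come back in insertion order, so they can serve as the header row
headers = list(columns_dict)

# Transpose the column lists back into rows
rows = [list(row) for row in zip(*columns_dict.values())]

matrix_with_header = [headers] + rows
print(matrix_with_header)
# [['ID', 'Name', 'Score'], [1, 'Alice', 95], [2, 'Bob', 87], [3, 'Charlie', 92]]
```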

Practical Example: CSV-Style Processing

A reusable function for converting CSV-like row data into a column-oriented dictionary:

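def csv_to_column_dict(rows, has_header=True):
    """Convert CSV-like rows to a column dictionary."""
    if not rows:
        return {}

    if has_header:
        # The first row supplies the column names
        headers = rows[0]
        data = rows[1:]
    else:
        # No header row: generate names col_0, col_1, ...
        headers = [f"col_{i}" for i in range(len(rows[0]))]
        data = rows

    # Assumes every row has at least as many values as there are headers
    return {
        header: [row[i] for row in data]
        for i, header in enumerate(headers)
    }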

csv_data = [
    ["product", "price", "stock"],
    ["Apple", 1.50, 100],
    ["Banana", 0.75, 150],
    ["Orange", 2.00, 80]
]

result = csv_to_column_dict(csv_data)
print(result)

Output:

{'product': ['Apple', 'Banana', 'Orange'], 'price': [1.5, 0.75, 2.0], 'stock': [100, 150, 80]}
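
The edge cases are worth exercising too. The sketch below restates the idea as a condensed, zip-based variant so that it runs on its own; the to_column_dict name is hypothetical. It differs from the function above in one respect: zip() truncates a short row instead of raising IndexError.

```python
def to_column_dict(rows, has_header=True):
    """Condensed, zip-based variant of the CSV conversion (illustrative)."""
    if not rows:
        return {}
    if has_header:
        headers, data = rows[0], rows[1:]
    else:
        headers, data = [f"col_{i}" for i in range(len(rows[0]))], rows
    # Start every column as an empty list so header-only input keeps its keys
    result = {h: [] for h in headers}
    for row in data:
        for h, v in zip(headers, row):
            result[h].append(v)
    return result

# No header row: column names are generated
no_header = to_column_dict([["Apple", 1.50], ["Banana", 0.75]], has_header=False)
print(no_header)  # {'col_0': ['Apple', 'Banana'], 'col_1': [1.5, 0.75]}

# Header only: every column is present but empty
header_only = to_column_dict([["product", "price"]])
print(header_only)  # {'product': [], 'price': []}

# Empty input: empty dictionary
print(to_column_dict([]))  # {}
```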

Quick Reference

| Method | Syntax | Best For |
| --- | --- | --- |
| zip(*matrix) | {col: list(vals) for col, vals in zip(cols, zip(*m))} | Quick scripts, no dependencies |
| Pandas | pd.DataFrame(m, columns=cols).to_dict(orient='list') | Data analysis pipelines |
| defaultdict | Loop with defaultdict(list) | Streaming or incremental data |
| List comprehension | [dict(zip(headers, row)) for row in data] | Row-wise records |
| NumPy | {col: matrix[:, i].tolist() ...} | Large numerical matrices |

Conclusion

Converting a matrix to a dictionary in Python depends on the output format you need and the tools available in your environment. Use zip(*matrix) for the most Pythonic, dependency-free approach to transposing and converting matrices. Use Pandas when the data will undergo further analysis or when you need flexible output orientations like records or index-keyed dictionaries. Use defaultdict for streaming scenarios where rows arrive incrementally. And for large numerical datasets, NumPy provides the most efficient column extraction.

Best Practice

Use zip(*matrix) for quick, dependency-free scripts. It is the most Pythonic way to transpose and convert matrices in a single expression. Use Pandas when the data will undergo further analysis or when you need multiple output orientations from the same data. Use defaultdict for streaming scenarios where data arrives row by row.