How to Convert a Matrix to a Dictionary in Python
Converting a matrix (list of lists) to a dictionary is a common data transformation, especially when pivoting tabular data so that each column becomes a dictionary key with its values collected into a list. This operation bridges the gap between row-oriented and column-oriented data representations.
In this guide, you will learn multiple approaches to perform this conversion, from pure Python techniques to Pandas and NumPy solutions, with practical examples for different matrix structures.
Using zip(*matrix) (Pure Python, Recommended)
The most Pythonic approach transposes the matrix using zip(*matrix) and pairs the resulting columns with header names:
matrix = [
[1, "Alice", 95],
[2, "Bob", 87],
[3, "Charlie", 92]
]
columns = ["ID", "Name", "Score"]
result = {col: list(vals) for col, vals in zip(columns, zip(*matrix))}
print(result)
Output:
{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}
How it works:
- zip(*matrix) unpacks the rows and transposes them into columns: (1, 2, 3), ('Alice', 'Bob', 'Charlie'), (95, 87, 92)
- zip(columns, ...) pairs each column name with its corresponding data
- The dictionary comprehension builds the final result, converting each tuple of values to a list
Understanding the Transpose Step
matrix = [
[1, 2, 3],
[4, 5, 6]
]
# zip(*matrix) is equivalent to zip([1,2,3], [4,5,6])
transposed = list(zip(*matrix))
print(transposed)
Output:
[(1, 4), (2, 5), (3, 6)]
Each tuple in the result contains all values from one column of the original matrix.
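One caveat with this transpose step: zip stops at the shortest input, so a row with a missing cell silently shortens the result. The sketch below uses a made-up ragged matrix to show the truncation, along with one possible workaround using itertools.zip_longest. Padding with None is an assumption; pick whatever fill value suits your data.

```python
from itertools import zip_longest

# Hypothetical ragged matrix: the second row is missing its last cell
ragged = [
    [1, 2, 3],
    [4, 5],
]

# Plain zip stops at the shortest row, so the third column is dropped
truncated = list(zip(*ragged))
print(truncated)  # [(1, 4), (2, 5)]

# zip_longest keeps every column and pads the gaps with fillvalue
padded = list(zip_longest(*ragged, fillvalue=None))
print(padded)  # [(1, 4), (2, 5), (3, None)]
```

On Python 3.10 and later, zip(*ragged, strict=True) is another option if you would rather raise a ValueError than pad.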
Using Pandas (Data Analysis Standard)
Pandas handles this transformation natively and offers multiple output orientations:
import pandas as pd
matrix = [
[1, "Alice", 95],
[2, "Bob", 87],
[3, "Charlie", 92]
]
columns = ["ID", "Name", "Score"]
result = pd.DataFrame(matrix, columns=columns).to_dict(orient='list')
print(result)
Output:
{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}
Different Output Orientations
Pandas provides several orientations to match different downstream needs:
import pandas as pd
matrix = [[1, "Alice"], [2, "Bob"]]
df = pd.DataFrame(matrix, columns=["ID", "Name"])
# Column-oriented: lists of values per column
print(df.to_dict(orient='list'))
# Index-oriented: row index maps to row dict
print(df.to_dict(orient='index'))
# Records: list of row dictionaries
print(df.to_dict(orient='records'))
Output:
{'ID': [1, 2], 'Name': ['Alice', 'Bob']}
{0: {'ID': 1, 'Name': 'Alice'}, 1: {'ID': 2, 'Name': 'Bob'}}
[{'ID': 1, 'Name': 'Alice'}, {'ID': 2, 'Name': 'Bob'}]
Using defaultdict for Incremental Building
When data arrives row by row or the dataset is too large to transpose all at once, defaultdict lets you build the dictionary incrementally:
from collections import defaultdict
matrix = [
[1, "Alice", 95],
[2, "Bob", 87],
[3, "Charlie", 92]
]
headers = ["ID", "Name", "Score"]
data = defaultdict(list)
for row in matrix:
    for header, value in zip(headers, row):
        data[header].append(value)
print(dict(data))
Output:
{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}
Processing Streaming Data
This approach is particularly useful when rows arrive one at a time from a data stream, file reader, or generator:
from collections import defaultdict
headers = ["X", "Y", "Z"]
data = defaultdict(list)
def process_row(row):
    for h, v in zip(headers, row):
        data[h].append(v)
# Simulating rows arriving over time
process_row([1, 2, 3])
process_row([4, 5, 6])
process_row([7, 8, 9])
print(dict(data))
Output:
{'X': [1, 4, 7], 'Y': [2, 5, 8], 'Z': [3, 6, 9]}
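As a rough illustration of the file-reader case, the same pattern can consume rows from the standard csv module. In this sketch an io.StringIO object stands in for a real open file, and the sample text is invented for the example. Note that csv.reader yields every value as a string.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a real file; io.StringIO behaves like an open text file
csv_text = "X,Y,Z\n1,2,3\n4,5,6\n"
stream = io.StringIO(csv_text)

reader = csv.reader(stream)
headers = next(reader)  # the first line holds the column names
data = defaultdict(list)

# Rows are consumed one at a time; no intermediate list of rows is built
for row in reader:
    for header, value in zip(headers, row):
        data[header].append(value)

print(dict(data))
# {'X': ['1', '4'], 'Y': ['2', '5'], 'Z': ['3', '6']}
```

If numeric columns are needed, convert the values as they arrive, for example with int(value) or float(value).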
Converting Rows to Dictionary Records
Instead of a columnar dictionary, you may need each row as an independent dictionary. This is the "records" format commonly used by APIs and databases:
matrix = [
[1, "Alice", 95],
[2, "Bob", 87],
[3, "Charlie", 92]
]
columns = ["id", "name", "score"]
records = [dict(zip(columns, row)) for row in matrix]
print(records)
Output:
[{'id': 1, 'name': 'Alice', 'score': 95}, {'id': 2, 'name': 'Bob', 'score': 87}, {'id': 3, 'name': 'Charlie', 'score': 92}]
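When records need to be looked up by a key rather than scanned in order, a small variation keys each record by one of its columns. This sketch assumes the first column holds a unique ID; duplicate IDs would overwrite earlier rows.

```python
matrix = [
    [1, "Alice", 95],
    [2, "Bob", 87],
    [3, "Charlie", 92],
]
columns = ["id", "name", "score"]

# Key each record by its first column (assumed here to be a unique ID)
by_id = {row[0]: dict(zip(columns, row)) for row in matrix}

print(by_id[2])
# {'id': 2, 'name': 'Bob', 'score': 87}
```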
Handling a Matrix with a Header Row
When the first row of the matrix contains column names, separate it from the data before converting:
matrix = [
["ID", "Name", "Score"], # Header row
[1, "Alice", 95],
[2, "Bob", 87],
[3, "Charlie", 92]
]
headers = matrix[0]
data = matrix[1:]
# Columnar dictionary
columnar = {col: list(vals) for col, vals in zip(headers, zip(*data))}
print(columnar)
# Or as a list of records
records = [dict(zip(headers, row)) for row in data]
print(records)
Output:
{'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [95, 87, 92]}
[{'ID': 1, 'Name': 'Alice', 'Score': 95}, {'ID': 2, 'Name': 'Bob', 'Score': 87}, {'ID': 3, 'Name': 'Charlie', 'Score': 92}]
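The header and data rows can also be split in a single statement with starred assignment. This is only an alternative spelling of the slicing approach, shown here on a shortened version of the sample matrix:

```python
matrix = [
    ["ID", "Name", "Score"],
    [1, "Alice", 95],
    [2, "Bob", 87],
]

# Starred assignment: the first row goes to headers, the rest to data
headers, *data = matrix

columnar = {col: list(vals) for col, vals in zip(headers, zip(*data))}
print(columnar)
# {'ID': [1, 2], 'Name': ['Alice', 'Bob'], 'Score': [95, 87]}
```

Unlike slicing, starred assignment also works on iterables that do not support indexing, such as generators, though it still reads all remaining rows into a list.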
Creating a 2D Coordinate Dictionary
For grid-based data, mapping each cell to its (row, col) coordinate provides efficient position-based lookups:
matrix = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
]
coord_dict = {
(row, col): value
for row, values in enumerate(matrix)
for col, value in enumerate(values)
}
print(coord_dict)
print(f"Value at (1, 1): {coord_dict[(1, 1)]}")
print(f"Value at (2, 0): {coord_dict[(2, 0)]}")
Output:
{(0, 0): 1, (0, 1): 2, (0, 2): 3, (1, 0): 4, (1, 1): 5, (1, 2): 6, (2, 0): 7, (2, 1): 8, (2, 2): 9}
Value at (1, 1): 5
Value at (2, 0): 7
Sparse Matrix Representation
For matrices where most values are zero, storing only the non-zero entries can save significant memory, and the saving grows with the proportion of zeros:
matrix = [
[0, 0, 3],
[0, 5, 0],
[7, 0, 0]
]
sparse = {
(row, col): value
for row, values in enumerate(matrix)
for col, value in enumerate(values)
if value != 0
}
print(sparse)
print(f"Non-zero entries: {len(sparse)} out of {len(matrix) * len(matrix[0])}")
Output:
{(0, 2): 3, (1, 1): 5, (2, 0): 7}
Non-zero entries: 3 out of 9
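Because zero cells are absent from the sparse dictionary, indexing them directly raises a KeyError. A minimal sketch of reading such a dictionary safely and rebuilding the dense matrix follows. The names n_rows and n_cols are introduced for illustration, since the dictionary itself does not record the matrix shape.

```python
sparse = {(0, 2): 3, (1, 1): 5, (2, 0): 7}
n_rows, n_cols = 3, 3  # the shape is not stored in the dict, so track it separately

# Missing coordinates are implicit zeros; dict.get supplies the default
print(sparse.get((1, 1), 0))  # 5
print(sparse.get((0, 0), 0))  # 0

# Rebuild the dense matrix from the sparse form
dense = [[sparse.get((r, c), 0) for c in range(n_cols)] for r in range(n_rows)]
print(dense)
# [[0, 0, 3], [0, 5, 0], [7, 0, 0]]
```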
Using NumPy for Numerical Data
For large numerical matrices, NumPy provides efficient column slicing:
import numpy as np
matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
columns = ["A", "B", "C"]
result = {col: matrix[:, i].tolist() for i, col in enumerate(columns)}
print(result)
Output:
{'A': [1, 4, 7], 'B': [2, 5, 8], 'C': [3, 6, 9]}
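NumPy column slicing (matrix[:, i]) runs in compiled code and returns a view of the array rather than a copy, so for large numerical arrays it is typically much faster than Python-level loops. Keep in mind that .tolist() converts the values back to Python objects, so skip it if the consumer can work with arrays directly.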
Reverse Operation: Dictionary to Matrix
To convert a columnar dictionary back to a matrix:
columns_dict = {
'ID': [1, 2, 3],
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [95, 87, 92]
}
column_order = ['ID', 'Name', 'Score']
matrix = [list(row) for row in zip(*[columns_dict[col] for col in column_order])]
print(matrix)
Output:
[[1, 'Alice', 95], [2, 'Bob', 87], [3, 'Charlie', 92]]
Specifying column_order explicitly ensures the columns appear in the expected sequence.
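A quick way to check that the forward and reverse conversions agree is a round trip. The sketch below reuses the sample data from earlier sections and confirms that converting to a columnar dictionary and back reproduces the original matrix:

```python
matrix = [[1, "Alice", 95], [2, "Bob", 87], [3, "Charlie", 92]]
columns = ["ID", "Name", "Score"]

# Forward: matrix -> columnar dictionary
columns_dict = {col: list(vals) for col, vals in zip(columns, zip(*matrix))}

# Backward: columnar dictionary -> matrix, using the same column order
restored = [list(row) for row in zip(*(columns_dict[col] for col in columns))]

print(restored == matrix)  # True
```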
Practical Example: CSV-Style Processing
A reusable function for converting CSV-like row data into a column-oriented dictionary:
def csv_to_column_dict(rows, has_header=True):
    """Convert CSV-like rows to a column dictionary."""
    if not rows:
        return {}
    if has_header:
        headers = rows[0]
        data = rows[1:]
    else:
        headers = [f"col_{i}" for i in range(len(rows[0]))]
        data = rows
    return {
        header: [row[i] for row in data]
        for i, header in enumerate(headers)
    }
csv_data = [
["product", "price", "stock"],
["Apple", 1.50, 100],
["Banana", 0.75, 150],
["Orange", 2.00, 80]
]
result = csv_to_column_dict(csv_data)
print(result)
Output:
{'product': ['Apple', 'Banana', 'Orange'], 'price': [1.5, 0.75, 2.0], 'stock': [100, 150, 80]}
Quick Reference
| Method | Syntax | Best For |
|---|---|---|
| zip(*matrix) | {col: list(vals) for col, vals in zip(cols, zip(*m))} | Quick scripts, no dependencies |
| Pandas | pd.DataFrame(m, columns=cols).to_dict(orient='list') | Data analysis pipelines |
| defaultdict | Loop with defaultdict(list) | Streaming or incremental data |
| List comprehension | [dict(zip(headers, row)) for row in data] | Row-wise records |
| NumPy | {col: matrix[:, i].tolist() ...} | Large numerical matrices |
Conclusion
Converting a matrix to a dictionary in Python depends on the output format you need and the tools available in your environment. Use zip(*matrix) for the most Pythonic, dependency-free approach to transposing and converting matrices. Use Pandas when the data will undergo further analysis or when you need flexible output orientations like records or index-keyed dictionaries. Use defaultdict for streaming scenarios where rows arrive incrementally. And for large numerical datasets, NumPy provides the most efficient column extraction.