Python Pandas: How to Create a Pandas DataFrame from a Dictionary of Lists
Constructing a DataFrame from a dictionary of lists is one of the most common and intuitive ways to create tabular data in Pandas. The structure maps naturally: dictionary keys become column headers, and the lists they contain become the data in each column. This pattern appears constantly in data processing, whether you are organizing results from calculations, preparing data for analysis, or converting API responses into structured tables.
In this guide, you will learn how to create DataFrames from dictionaries, customize the index and column order, handle common errors, and control data types.
Basic Conversion
Pass a dictionary of lists directly to the pd.DataFrame() constructor. Each key becomes a column name, and each list becomes the column's data:
import pandas as pd
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [85, 92, 78],
'Status': ['Pass', 'Pass', 'Fail']
}
df = pd.DataFrame(data)
print(df)
Output:
Name Score Status
0 Alice 85 Pass
1 Bob 92 Pass
2 Charlie 78 Fail
Pandas automatically assigns a numeric index (0, 1, 2) and infers the data type for each column.
Building from Separate List Variables
When your data already exists as separate list variables, you can combine them into a dictionary inline:
import pandas as pd
names = ['Alice', 'Bob', 'Charlie']
scores = [85, 92, 78]
grades = ['B', 'A', 'C']
df = pd.DataFrame({
'Name': names,
'Score': scores,
'Grade': grades
})
print(df)
Output:
Name Score Grade
0 Alice 85 B
1 Bob 92 A
2 Charlie 78 C
This approach is especially common when you have computed each column separately and want to bring them together into a single table.
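The same idea applies when the column lists have to be extracted first, for example from records in a parsed API response. The sketch below is only an illustration: the records list and its keys are hypothetical stand-ins, not data from a real service.

```python
import pandas as pd

# Hypothetical records, e.g. the parsed JSON body of an API response
records = [
    {'name': 'Alice', 'score': 85},
    {'name': 'Bob', 'score': 92},
    {'name': 'Charlie', 'score': 78},
]

# Build one list per column from the records
names = [record['name'] for record in records]
scores = [record['score'] for record in records]

# Combine the lists into a dictionary and pass it to the constructor
df = pd.DataFrame({'Name': names, 'Score': scores})
print(df)
```

Each list comprehension walks the records in the same order, so the rows stay aligned across columns.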
Adding a Custom Index
By default, Pandas assigns a sequential numeric index starting at 0. You can replace this with meaningful labels using the index parameter:
import pandas as pd
data = {
'City': ['NYC', 'LA', 'Chicago'],
'Population': [8.3, 3.9, 2.7] # In millions
}
df = pd.DataFrame(data, index=['ny', 'ca', 'il'])
print(df)
Output:
City Population
ny NYC 8.3
ca LA 3.9
il Chicago 2.7
Custom indices allow you to look up rows by label (e.g., df.loc['ny']) instead of by position, which makes code more readable and less error-prone.
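As a brief illustration of label-based lookup, the following sketch reuses the City and Population data from this section. The variable names row and la_population are chosen here for the example only.

```python
import pandas as pd

data = {
    'City': ['NYC', 'LA', 'Chicago'],
    'Population': [8.3, 3.9, 2.7]
}
df = pd.DataFrame(data, index=['ny', 'ca', 'il'])

# Select a whole row by its label; the result is a Series keyed by column name
row = df.loc['ny']
print(row['City'])

# Select a single cell by row label and column name
la_population = df.loc['ca', 'Population']
print(la_population)

# Positional access still works through iloc
print(df.iloc[0]['City'])
```

Using loc with labels keeps the lookup valid even if the rows are later reordered, whereas a positional lookup with iloc would then point at a different row.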
Controlling Column Order
In modern Python (3.7+), dictionaries maintain insertion order, so columns typically appear in the order you define them. If you need a specific order regardless of the dictionary, use the columns parameter:
import pandas as pd
data = {
'B': [1, 2],
'A': [3, 4],
'C': [5, 6]
}
# Without specifying order: columns appear as B, A, C
df_default = pd.DataFrame(data)
print(f"Default order: {df_default.columns.tolist()}")
# With explicit order
df_ordered = pd.DataFrame(data, columns=['A', 'B', 'C'])
print(f"Specified order: {df_ordered.columns.tolist()}")
Output:
Default order: ['B', 'A', 'C']
Specified order: ['A', 'B', 'C']
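The columns parameter also selects which keys are used, not just their order. The short sketch below shows both effects; the column name 'D' is a hypothetical key that does not exist in the dictionary.

```python
import pandas as pd

data = {
    'B': [1, 2],
    'A': [3, 4],
    'C': [5, 6]
}

# Listing only some keys keeps just those columns, in the given order
df_subset = pd.DataFrame(data, columns=['C', 'A'])
print(df_subset.columns.tolist())

# 'D' is not a key in data: the column is created and filled with missing values
df_extra = pd.DataFrame(data, columns=['A', 'D'])
print(df_extra['D'].isna().all())
```

Because an unknown name silently produces an empty column instead of an error, a misspelled column name in this list is worth checking for.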
Handling Unequal List Lengths
All lists in the dictionary must have the same number of elements. If they differ, Pandas raises a ValueError:
import pandas as pd
data = {
'A': [1, 2, 3],
'B': [4, 5] # Only 2 elements instead of 3
}
try:
    df = pd.DataFrame(data)
except ValueError as e:
    print(f"Error: {e}")
Output:
Error: All arrays must be of the same length
This is one of the most common errors when creating DataFrames from dictionaries. Make sure all your lists are the same length before passing them in.
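One way to catch the problem earlier, and with a message that names the offending columns, is to compare the lengths before calling the constructor. The helper below, check_equal_lengths, is a hypothetical function written for this sketch and is not part of Pandas.

```python
import pandas as pd

def check_equal_lengths(data):
    """Raise a descriptive ValueError if the lists in data differ in length."""
    lengths = {key: len(values) for key, values in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")

data = {
    'A': [1, 2, 3],
    'B': [4, 5]
}

try:
    check_equal_lengths(data)
    df = pd.DataFrame(data)
except ValueError as e:
    message = str(e)
    print(f"Error: {message}")
```

The custom message lists the length of every column, which makes it easier to see which list is short.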
If your data genuinely has different lengths, pad the shorter lists with None before creating the DataFrame. In a numeric column, Pandas converts None to NaN, which also changes an integer column to float64:
import pandas as pd
data = {
'A': [1, 2, 3],
'B': [4, 5, None] # Padded to match length
}
df = pd.DataFrame(data)
print(df)
Output:
A B
0 1 4.0
1 2 5.0
2 3 NaN
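Padding by hand becomes tedious with many columns. The sketch below shows two possible ways to automate it, assuming that trailing padding is what you want: a hypothetical pad_lists helper written for this example, and wrapping each list in a pd.Series so that Pandas aligns the columns by index.

```python
import pandas as pd

data = {
    'A': [1, 2, 3],
    'B': [4, 5]
}

def pad_lists(data, fill=None):
    """Return a copy of data with every list padded to the longest length."""
    longest = max(len(values) for values in data.values())
    return {
        key: list(values) + [fill] * (longest - len(values))
        for key, values in data.items()
    }

# Option 1: pad explicitly, then construct as usual
df_padded = pd.DataFrame(pad_lists(data))
print(df_padded)

# Option 2: wrap each list in a Series; missing positions become NaN
df_series = pd.DataFrame({key: pd.Series(values) for key, values in data.items()})
print(df_series)
```

Both options place the missing values at the end of the shorter column. If the gaps belong elsewhere, the lists need to be aligned manually first.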
Specifying and Checking Data Types
Pandas infers data types automatically, but you can verify and override them when needed:
import pandas as pd
data = {
'ID': [1, 2, 3],
'Value': [10.5, 20.3, 30.1]
}
df = pd.DataFrame(data)
# Check inferred types
print(df.dtypes)
print()
# Convert types after creation
df = df.astype({'ID': 'str', 'Value': 'float32'})
print(df.dtypes)
Output:
ID int64
Value float64
dtype: object
ID object
Value float32
dtype: object
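You can also set a type during creation with the dtype parameter. It accepts a single type and applies it to every column, so it is best suited to DataFrames whose columns all share one type: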
import pandas as pd
df = pd.DataFrame({'Values': [1, 2, 3]}, dtype='float64')
print(df.dtypes)
Output:
Values float64
dtype: object
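When columns need different types at creation time, one option is to wrap each list in a pd.Series with its own dtype before building the DataFrame. This is a minimal sketch of that approach, reusing the ID and Value columns from earlier in this section.

```python
import pandas as pd

# Each Series carries its own dtype into the resulting DataFrame
df = pd.DataFrame({
    'ID': pd.Series([1, 2, 3], dtype='int32'),
    'Value': pd.Series([10.5, 20.3, 30.1], dtype='float32')
})

print(df.dtypes)
```

This avoids a separate astype call after construction, at the cost of slightly more verbose column definitions.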
Practical Example: Combining Computed Results
A common real-world pattern is computing several lists of results and combining them into a DataFrame for analysis or export:
import pandas as pd
# Simulated computation results
products = ['Widget', 'Gadget', 'Gizmo']
prices = [29.99, 49.99, 19.99]
quantities = [150, 85, 200]
# Compute derived values
revenues = [p * q for p, q in zip(prices, quantities)]
# Combine everything into a DataFrame
df = pd.DataFrame({
'Product': products,
'Price': prices,
'Quantity': quantities,
'Revenue': revenues
})
print(df)
Output:
Product Price Quantity Revenue
0 Widget 29.99 150 4498.50
1 Gadget 49.99 85 4249.15
2 Gizmo 19.99 200 3998.00
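Once the data is in a DataFrame, the derived column can also be computed with column arithmetic instead of a list comprehension, and the result can be ranked or exported. The following sketch assumes the same product data. Rounding to two decimals and the variable names ranked, top_product, and csv_text are choices made for this example.

```python
import pandas as pd

products = ['Widget', 'Gadget', 'Gizmo']
prices = [29.99, 49.99, 19.99]
quantities = [150, 85, 200]

df = pd.DataFrame({
    'Product': products,
    'Price': prices,
    'Quantity': quantities
})

# Vectorized alternative to the list comprehension
df['Revenue'] = (df['Price'] * df['Quantity']).round(2)

# Rank products by revenue, highest first
ranked = df.sort_values('Revenue', ascending=False)
top_product = ranked.iloc[0]['Product']
print(top_product)

# With no path argument, to_csv returns the CSV text as a string
csv_text = ranked.to_csv(index=False)
print(csv_text)
```

Passing a file path to to_csv would write the same content to disk instead of returning it.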
Quick Reference
| Parameter | Purpose | Example |
|---|---|---|
| data | Dictionary of lists | {'A': [1, 2], 'B': [3, 4]} |
| index | Custom row labels | index=['x', 'y'] |
| columns | Column order | columns=['B', 'A'] |
| dtype | Force data type | dtype='float64' |
| Rule | Description |
|---|---|
| Equal length | All lists must have the same number of elements |
| Keys as columns | Dictionary keys become column headers |
| Automatic index | Numeric index (0, 1, 2...) unless specified |
| Type inference | Pandas automatically detects data types |
- Creating a DataFrame from a dictionary of lists is the standard pattern for converting Python data into tabular format.
- Each key becomes a column name and each list becomes the column data.
- Ensure all lists have equal length to avoid errors, use index for custom row labels, and use columns to control the ordering when it matters.