Python Pandas: How to Create a Pandas DataFrame from a Dictionary of Lists

Constructing a DataFrame from a dictionary of lists is one of the most common ways to create tabular data in Pandas. The structure maps naturally: dictionary keys become column headers, and the lists they contain become the data in each column. This pattern appears constantly in data processing, whether you are organizing results from calculations, preparing data for analysis, or converting API responses into structured tables.

In this guide, you will learn how to create DataFrames from dictionaries, customize the index and column order, handle common errors, and control data types.

Basic Conversion

Pass a dictionary of lists directly to the pd.DataFrame() constructor. Each key becomes a column name, and each list becomes the column's data:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, 92, 78],
    'Status': ['Pass', 'Pass', 'Fail']
}

df = pd.DataFrame(data)

print(df)

Output:

      Name  Score Status
0    Alice     85   Pass
1      Bob     92   Pass
2  Charlie     78   Fail

Pandas automatically assigns a numeric index (0, 1, 2) and infers the data type for each column.
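To see what Pandas inferred, you can inspect the index, shape, and column types directly. The following is a minimal sketch that reuses the data from the example above; the variable names index_labels, shape, and score_is_integer are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, 92, 78],
    'Status': ['Pass', 'Pass', 'Fail'],
})

# The default index is a sequence of integers starting at 0
index_labels = df.index.tolist()

# shape is a (rows, columns) tuple
shape = df.shape

# A list of Python ints is inferred as an integer column
score_is_integer = pd.api.types.is_integer_dtype(df['Score'])

print(index_labels, shape, score_is_integer)
```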

Building from Separate List Variables

When your data already exists as separate list variables, you can combine them into a dictionary inline:

import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
scores = [85, 92, 78]
grades = ['B', 'A', 'C']

df = pd.DataFrame({
    'Name': names,
    'Score': scores,
    'Grade': grades
})

print(df)

Output:

      Name  Score Grade
0    Alice     85     B
1      Bob     92     A
2  Charlie     78     C

This approach is especially common when you have computed each column separately and want to bring them together into a single table.
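If the column names and the lists are themselves held in sequences (for example, when the set of columns is decided at runtime), the dictionary can be built with dict(zip(...)) instead of being written out by hand. This is a sketch of one possible approach; column_names and column_values are hypothetical variable names:

```python
import pandas as pd

# Column names and their data, kept in two parallel lists
column_names = ['Name', 'Score', 'Grade']
column_values = [
    ['Alice', 'Bob', 'Charlie'],
    [85, 92, 78],
    ['B', 'A', 'C'],
]

# Pair each name with its list to form a dictionary of lists
data = dict(zip(column_names, column_values))

df = pd.DataFrame(data)
print(df)
```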

Adding a Custom Index

By default, Pandas assigns a sequential numeric index starting at 0. You can replace this with meaningful labels using the index parameter:

import pandas as pd

data = {
    'City': ['NYC', 'LA', 'Chicago'],
    'Population': [8.3, 3.9, 2.7]
}

df = pd.DataFrame(data, index=['ny', 'ca', 'il'])

print(df)

Output:

       City  Population
ny      NYC         8.3
ca       LA         3.9
il  Chicago         2.7

Custom indices allow you to look up rows by label (e.g., df.loc['ny']) instead of by position, which makes code more readable and less error-prone.
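The difference between label-based and position-based lookup can be sketched as follows, reusing the city data from above. Here .loc selects by index label and .iloc selects by integer position:

```python
import pandas as pd

data = {
    'City': ['NYC', 'LA', 'Chicago'],
    'Population': [8.3, 3.9, 2.7],
}
df = pd.DataFrame(data, index=['ny', 'ca', 'il'])

# Look up a whole row by its label
ny_row = df.loc['ny']

# Look up a single cell by row label and column name
la_population = df.loc['ca', 'Population']

# The same first row, selected by position instead of label
first_row = df.iloc[0]

print(ny_row['City'], la_population, first_row['City'])
```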

Controlling Column Order

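In modern Python (3.7+), dictionaries maintain insertion order, so columns appear in the order you define the keys. If you need a specific order regardless of the dictionary, use the columns parameter. Note that columns also acts as a filter: keys you leave out are dropped from the result, and names that are not in the dictionary become columns filled with NaN.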

import pandas as pd

data = {
    'B': [1, 2],
    'A': [3, 4],
    'C': [5, 6]
}

# Without specifying order: columns appear as B, A, C
df_default = pd.DataFrame(data)
print(f"Default order: {df_default.columns.tolist()}")

# With explicit order
df_ordered = pd.DataFrame(data, columns=['A', 'B', 'C'])
print(f"Specified order: {df_ordered.columns.tolist()}")

Output:

Default order: ['B', 'A', 'C']
Specified order: ['A', 'B', 'C']
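Beyond reordering, the columns parameter selects which dictionary keys are used. The sketch below illustrates this with the same data: a key that is left out is dropped, and a name that is not in the dictionary becomes a column of NaN. Selecting with a list of names after creation is shown as an alternative way to reorder:

```python
import pandas as pd

data = {
    'B': [1, 2],
    'A': [3, 4],
    'C': [5, 6],
}

# Only 'C' and 'A' are kept; 'B' is dropped
subset = pd.DataFrame(data, columns=['C', 'A'])

# 'D' is not a key in data, so it becomes a column of NaN
with_extra = pd.DataFrame(data, columns=['A', 'B', 'D'])

# Reordering after creation by selecting columns with a list
reordered = pd.DataFrame(data)[['A', 'B', 'C']]

print(subset.columns.tolist())
print(with_extra['D'].isna().all())
print(reordered.columns.tolist())
```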

Handling Unequal List Lengths

All lists in the dictionary must have the same number of elements. If they differ, Pandas raises a ValueError:

import pandas as pd

data = {
    'A': [1, 2, 3],
    'B': [4, 5]  # Only 2 elements instead of 3
}

try:
    df = pd.DataFrame(data)
except ValueError as e:
    print(f"Error: {e}")

Output:

Error: All arrays must be of the same length

This is one of the most common errors when creating DataFrames from dictionaries. Make sure all your lists are the same length before passing them in.
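One way to catch the problem early is to compare the list lengths yourself before calling the constructor, which also lets you report which column is short. A minimal sketch, where lengths and message are illustrative names:

```python
import pandas as pd

data = {
    'A': [1, 2, 3],
    'B': [4, 5],  # one element short
}

# Record the length of every list, keyed by column name
lengths = {name: len(values) for name, values in data.items()}

if len(set(lengths.values())) == 1:
    # All lists have the same length, so construction is safe
    df = pd.DataFrame(data)
else:
    df = None
    message = f"Column lengths differ: {lengths}"
    print(message)
```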

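Tip

If your data genuinely has different lengths, pad the shorter lists with None before creating the DataFrame. In a numeric column, Pandas stores the padding as NaN and converts the column to float64, which is why the values in B print as 4.0 and 5.0: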

import pandas as pd

data = {
    'A': [1, 2, 3],
    'B': [4, 5, None]  # Padded to match length
}

df = pd.DataFrame(data)
print(df)

Output:

   A    B
0  1  4.0
1  2  5.0
2  3  NaN
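Padding by hand does not scale to many columns, so the same idea can be wrapped in a small helper. The function pad_lists below is a hypothetical name used for illustration, not part of Pandas. Wrapping each list in pd.Series is another option, since the constructor aligns Series on their index and fills the gaps with NaN:

```python
import pandas as pd


def pad_lists(columns, fill=None):
    """Return a copy of the dict with every list padded to the longest length."""
    longest = max(len(values) for values in columns.values())
    return {
        name: list(values) + [fill] * (longest - len(values))
        for name, values in columns.items()
    }


data = {
    'A': [1, 2, 3],
    'B': [4, 5],
}

# Option 1: pad the lists, then construct as usual
df_padded = pd.DataFrame(pad_lists(data))

# Option 2: let Pandas align Series of different lengths
df_series = pd.DataFrame({name: pd.Series(values) for name, values in data.items()})

print(df_padded)
```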

Specifying and Checking Data Types

Pandas infers data types automatically, but you can verify and override them when needed:

import pandas as pd

data = {
    'ID': [1, 2, 3],
    'Value': [10.5, 20.3, 30.1]
}

df = pd.DataFrame(data)

# Check inferred types
print(df.dtypes)
print()

# Convert types after creation
df = df.astype({'ID': 'str', 'Value': 'float32'})
print(df.dtypes)

Output:

ID         int64
Value    float64
dtype: object

ID        object
Value    float32
dtype: object

You can also set the type during creation using the dtype parameter. It applies a single type to every column, so it is best suited to DataFrames whose columns should all share one type:

import pandas as pd

df = pd.DataFrame({'Values': [1, 2, 3]}, dtype='float64')

print(df.dtypes)

Output:

Values    float64
dtype: object
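Because the dtype parameter applies one type to the whole DataFrame, a different type per column at creation time needs another route. One option, sketched here, is to pass each column as a pd.Series with its own dtype; the int32 and float32 choices are only examples:

```python
import pandas as pd

df = pd.DataFrame({
    # Each Series carries its own dtype into the DataFrame
    'ID': pd.Series([1, 2, 3], dtype='int32'),
    'Value': pd.Series([10.5, 20.3, 30.1], dtype='float32'),
})

print(df.dtypes)
```

Calling astype with a dictionary after creation, as shown earlier, achieves the same result in two steps.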

Practical Example: Combining Computed Results

A common real-world pattern is computing several lists of results and combining them into a DataFrame for analysis or export:

import pandas as pd

# Simulated computation results
products = ['Widget', 'Gadget', 'Gizmo']
prices = [29.99, 49.99, 19.99]
quantities = [150, 85, 200]

# Compute derived values
revenues = [p * q for p, q in zip(prices, quantities)]

# Combine everything into a DataFrame
df = pd.DataFrame({
    'Product': products,
    'Price': prices,
    'Quantity': quantities,
    'Revenue': revenues
})

print(df)

Output:

  Product  Price  Quantity  Revenue
0  Widget  29.99       150  4498.50
1  Gadget  49.99        85  4249.15
2   Gizmo  19.99       200  3998.00
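Once the base lists are in a DataFrame, derived columns can also be computed with column arithmetic instead of a list comprehension, and the result can be analyzed or exported. A sketch of that variation on the same data, where top_product and csv_text are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Widget', 'Gadget', 'Gizmo'],
    'Price': [29.99, 49.99, 19.99],
    'Quantity': [150, 85, 200],
})

# Element-wise multiplication of two columns
df['Revenue'] = df['Price'] * df['Quantity']

# Analysis: find the product with the highest revenue
top_product = df.loc[df['Revenue'].idxmax(), 'Product']

# Export: to_csv returns a string when no file path is given
csv_text = df.to_csv(index=False)

print(top_product)
print(csv_text)
```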

Quick Reference

Parameter   Purpose               Example
data        Dictionary of lists   {'A': [1, 2], 'B': [3, 4]}
index       Custom row labels     index=['x', 'y']
columns     Column order          columns=['B', 'A']
dtype       Force data type       dtype='float64'

Rule              Description
Equal length      All lists must have the same number of elements
Keys as columns   Dictionary keys become column headers
Automatic index   Numeric index (0, 1, 2...) unless specified
Type inference    Pandas automatically detects data types

  • Creating a DataFrame from a dictionary of lists is the standard pattern for converting Python data into tabular format.
  • Each key becomes a column name and each list becomes the column data.
  • Ensure all lists have equal length to avoid errors, use index for custom row labels, and use columns to control the ordering when it matters.
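As a compact recap, the four constructor parameters covered in this guide can be combined in a single call. This is a minimal sketch with made-up values:

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2], 'B': [3, 4]},  # data: dictionary of lists
    index=['x', 'y'],            # custom row labels
    columns=['B', 'A'],          # explicit column order
    dtype='float64',             # one type applied to every column
)

print(df)
print(df.dtypes)
```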