
Python Pandas: How to Create a Pandas DataFrame from a List of Dictionaries

Lists of dictionaries are one of the most common data formats in Python. They appear naturally when working with JSON API responses, document databases like MongoDB, configuration files, and any scenario where records are represented as key-value pairs. Pandas handles this format natively, automatically mapping dictionary keys to column headers and filling in missing values when records have inconsistent structures.

In this guide, you will learn how to convert lists of dictionaries into DataFrames, handle missing keys and nested structures, and work with real-world API data.

Direct Conversion

Pass the list of dictionaries directly to pd.DataFrame(). Each dictionary becomes a row, and the keys become column names:

import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 25, 'City': 'NYC'},
    {'Name': 'Bob', 'Age': 30, 'City': 'LA'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}
]

df = pd.DataFrame(data)

print(df)

Output:

      Name  Age     City
0    Alice   25      NYC
1      Bob   30       LA
2  Charlie   35  Chicago

Pandas inspects all the dictionaries, collects the unique keys to form columns, and places each dictionary's values in the corresponding row.
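As a quick check on that description, the sketch below inspects the shape and column names of the resulting DataFrame. It also builds the same table with pd.DataFrame.from_records(), an alternative constructor that accepts the same list; the variable name df_records is only illustrative.

```python
import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 25, 'City': 'NYC'},
    {'Name': 'Bob', 'Age': 30, 'City': 'LA'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}
]

df = pd.DataFrame(data)

# One row per dictionary, one column per distinct key
print(df.shape)             # (3, 3)
print(df.columns.tolist())  # ['Name', 'Age', 'City']

# from_records() accepts the same list of dictionaries
df_records = pd.DataFrame.from_records(data)
print(df_records.columns.tolist())
```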

Handling Missing Keys

Real-world data is often inconsistent. Some records may lack certain fields, while others include extra ones. Pandas handles this automatically by filling missing values with NaN:

import pandas as pd

data = [
    {'Name': 'Alice', 'Score': 95},
    {'Name': 'Bob'},  # Missing Score
    {'Name': 'Charlie', 'Score': 78, 'Bonus': 5}  # Extra key
]

df = pd.DataFrame(data)

print(df)

Output:

      Name  Score  Bonus
0    Alice   95.0    NaN
1      Bob    NaN    NaN
2  Charlie   78.0    5.0

The Bonus column exists because one dictionary included it, but Alice and Bob get NaN since their dictionaries did not have that key. Similarly, Bob's Score is NaN because his dictionary had no Score entry. Note that Score is displayed as 95.0 rather than 95: NaN is a floating-point value, so Pandas stores a numeric column with gaps as float64. This behavior lets Pandas load inconsistent records without any preprocessing.
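Where integer columns should stay integers despite the gaps, the following sketch shows two options: filling the gaps with a default, or switching to the nullable Int64 dtype. Using 0 as the default for Bonus is an assumption made for illustration; whether a default is appropriate depends on the data.

```python
import pandas as pd

data = [
    {'Name': 'Alice', 'Score': 95},
    {'Name': 'Bob'},
    {'Name': 'Charlie', 'Score': 78, 'Bonus': 5}
]

df = pd.DataFrame(data)

# A numeric column containing NaN is stored as float64
original_dtype = df['Score'].dtype
print(original_dtype)

# Option 1: replace missing values with a default (0 is an assumed default)
# and convert back to a plain integer column
df['Bonus'] = df['Bonus'].fillna(0).astype(int)

# Option 2: keep the gap visible as <NA> but store integers,
# using the nullable Int64 extension dtype
df['Score'] = df['Score'].astype('Int64')

print(df)
```

Option 2 keeps the information that Bob has no score, which a filled-in 0 would hide.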

Selecting Specific Columns

If your dictionaries contain more fields than you need, use the columns parameter to include only the ones you want:

import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 25, 'Score': 95, 'Status': 'Active'},
    {'Name': 'Bob', 'Age': 30, 'Score': 87, 'Status': 'Active'}
]

df = pd.DataFrame(data, columns=['Name', 'Score'])

print(df)

Output:

    Name  Score
0  Alice     95
1    Bob     87

The Age and Status fields are silently excluded from the result.
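The columns parameter also controls column order, and a name that appears in no dictionary yields a column of missing values rather than an error. The sketch below illustrates both; Grade is a hypothetical column name chosen because none of the records contain it.

```python
import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 25, 'Score': 95, 'Status': 'Active'},
    {'Name': 'Bob', 'Age': 30, 'Score': 87, 'Status': 'Active'}
]

# Column order follows the list passed to columns, not the dictionaries.
# 'Grade' is not a key in any record, so that column is entirely missing values.
df = pd.DataFrame(data, columns=['Score', 'Name', 'Grade'])

print(df.columns.tolist())
print(df['Grade'].isna().all())
```

Because an unknown name does not raise, a typo in the columns list can go unnoticed, so it may be worth checking for all-missing columns after loading.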

Flattening Nested Dictionaries

When dictionaries contain nested structures, the standard pd.DataFrame() constructor stores the inner dictionaries as single cell values. Use pd.json_normalize() to flatten them into separate columns:

import pandas as pd

data = [
    {
        'Name': 'Alice',
        'Contact': {'Email': 'alice@test.com', 'Phone': '555-1234'}
    },
    {
        'Name': 'Bob',
        'Contact': {'Email': 'bob@test.com', 'Phone': '555-5678'}
    }
]

df = pd.json_normalize(data)

print(df)

Output:

    Name   Contact.Email  Contact.Phone
0  Alice  alice@test.com       555-1234
1    Bob    bob@test.com       555-5678

The nested Contact dictionary is expanded into separate columns with dot-separated names that reflect the hierarchy.
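Dots in column names rule out attribute-style access such as df.Contact.Email. If that matters, the sep parameter of pd.json_normalize() changes the separator; the sketch below uses an underscore, which is just one possible choice.

```python
import pandas as pd

data = [
    {
        'Name': 'Alice',
        'Contact': {'Email': 'alice@test.com', 'Phone': '555-1234'}
    },
    {
        'Name': 'Bob',
        'Contact': {'Email': 'bob@test.com', 'Phone': '555-5678'}
    }
]

# sep replaces the default '.' between the parent key and the nested key
df = pd.json_normalize(data, sep='_')

print(df.columns.tolist())  # ['Name', 'Contact_Email', 'Contact_Phone']
```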

Handling Deeply Nested Structures

pd.json_normalize() handles multiple levels of nesting automatically:

import pandas as pd

data = [
    {
        'User': 'Alice',
        'Meta': {
            'Location': {'City': 'NYC', 'Country': 'USA'},
            'Device': 'iPhone'
        }
    }
]

df = pd.json_normalize(data)

print(df.columns.tolist())
print(df)

Output:

['User', 'Meta.Location.City', 'Meta.Location.Country', 'Meta.Device']
    User  Meta.Location.City  Meta.Location.Country  Meta.Device
0  Alice                 NYC                    USA       iPhone

Every level of nesting is flattened into a dot-separated column name, producing a single flat table regardless of how deep the original structure goes.
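Full flattening is the default, not the only option. The max_level parameter caps how many levels are expanded, which can help when the deepest levels are better left as dictionaries. A minimal sketch, reusing the record above:

```python
import pandas as pd

data = [
    {
        'User': 'Alice',
        'Meta': {
            'Location': {'City': 'NYC', 'Country': 'USA'},
            'Device': 'iPhone'
        }
    }
]

# max_level=1 expands 'Meta' but leaves 'Location' as a dictionary in one cell
df = pd.json_normalize(data, max_level=1)

print(sorted(df.columns))
print(df.loc[0, 'Meta.Location'])
```

Here Meta.Location holds the original inner dictionary, while Meta.Device is an ordinary column.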

Tip

Use pd.DataFrame() when your dictionaries are flat (no nesting). Use pd.json_normalize() when they contain nested dictionaries that you want expanded into separate columns. If you want to keep nested data as-is within a single cell, pd.DataFrame() will do that by default.

Practical Example: Processing API Responses

REST APIs commonly return data as JSON arrays. Calling response.json() on a requests response parses such an array into a list of dictionaries:

import pandas as pd
import requests

response = requests.get('https://api.example.com/users')
data = response.json() # Returns a list of dictionaries

# For flat API responses
df = pd.DataFrame(data)

# For nested API responses
df = pd.json_normalize(data)
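When it is not known in advance whether a response is flat or nested, the choice between the two constructors can be made from the data itself. The helper below, records_to_frame, is a hypothetical name and not part of Pandas; it is a sketch that assumes every record is a dictionary, and it runs on in-memory sample records rather than a live API.

```python
import pandas as pd


def records_to_frame(records):
    """Build a DataFrame, flattening only if some record holds a nested dict."""
    has_nested = any(
        isinstance(value, dict)
        for record in records
        for value in record.values()
    )
    return pd.json_normalize(records) if has_nested else pd.DataFrame(records)


# Sample payloads standing in for parsed API responses
flat = [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
nested = [{'id': 1, 'profile': {'city': 'NYC'}}]

flat_df = records_to_frame(flat)
nested_df = records_to_frame(nested)

print(flat_df.columns.tolist())    # ['id', 'name']
print(nested_df.columns.tolist())  # ['id', 'profile.city']
```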

Handling Paginated API Data

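When an API returns data across multiple pages, collect all records in a plain list and create the DataFrame once at the end. This avoids building and concatenating a separate DataFrame for every page: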

import pandas as pd
import requests

all_records = []
page = 1

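while True:
    response = requests.get(f'https://api.example.com/users?page={page}')
    response.raise_for_status()  # Stop on HTTP errors instead of parsing an error body
    data = response.json()

    if not data:  # An empty page means there are no more records
        break

    all_records.extend(data)
    page += 1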

df = pd.DataFrame(all_records)
print(f"Loaded {len(df)} records from {page - 1} pages")
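The same collect-then-convert pattern can be tried without network access. In the sketch below, FAKE_PAGES, fetch_page, and load_all are hypothetical names that simulate an endpoint returning an empty list once the pages run out; the max_pages cap is an added safeguard against an endpoint that never returns an empty page.

```python
import pandas as pd

# Hypothetical stand-in for a paginated endpoint: three pages of records
FAKE_PAGES = {
    1: [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}],
    2: [{'id': 3, 'name': 'Charlie'}, {'id': 4, 'name': 'Dana'}],
    3: [{'id': 5, 'name': 'Eve'}],
}


def fetch_page(page):
    """Return the records for one page, or an empty list past the last page."""
    return FAKE_PAGES.get(page, [])


def load_all(fetch, max_pages=100):
    """Collect records page by page, then build the DataFrame once."""
    records = []
    pages_read = 0
    for page in range(1, max_pages + 1):
        batch = fetch(page)
        if not batch:  # Empty page: nothing left to read
            break
        records.extend(batch)
        pages_read += 1
    return pd.DataFrame(records), pages_read


df, pages_read = load_all(fetch_page)
print(f"Loaded {len(df)} records from {pages_read} pages")
```

Passing the fetch function as an argument keeps the loop testable: a real implementation could wrap requests.get() behind the same one-argument interface.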

What Happens Without json_normalize()

To understand why json_normalize() matters, compare what happens when you pass nested dictionaries to the regular constructor:

import pandas as pd

data = [
    {'Name': 'Alice', 'Contact': {'Email': 'alice@test.com'}},
    {'Name': 'Bob', 'Contact': {'Email': 'bob@test.com'}}
]

# Without json_normalize: nested dict stored as a single cell value
df_flat = pd.DataFrame(data)
print("pd.DataFrame():")
print(df_flat)
print()

# With json_normalize: nested dict expanded into columns
df_normalized = pd.json_normalize(data)
print("pd.json_normalize():")
print(df_normalized)

Output:

pd.DataFrame():
    Name                      Contact
0  Alice  {'Email': 'alice@test.com'}
1    Bob    {'Email': 'bob@test.com'}

pd.json_normalize():
    Name   Contact.Email
0  Alice  alice@test.com
1    Bob    bob@test.com

With the standard constructor, the entire nested dictionary is stored as a single object in the Contact column, making it difficult to filter or analyze. With json_normalize(), each nested field becomes its own column.
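If a DataFrame has already been built with the standard constructor, the dictionary column can still be expanded afterwards. The following is one possible approach, not the only one: normalize just that column, prefix the new column names, and join the result back. It assumes the DataFrame has the default integer index, so the rows line up.

```python
import pandas as pd

data = [
    {'Name': 'Alice', 'Contact': {'Email': 'alice@test.com'}},
    {'Name': 'Bob', 'Contact': {'Email': 'bob@test.com'}}
]

df = pd.DataFrame(data)

# Flatten only the Contact column, then prefix its columns for clarity
contact = pd.json_normalize(df['Contact'].tolist()).add_prefix('Contact.')

# Swap the dictionary column for the expanded columns (rows align on the index)
expanded = df.drop(columns='Contact').join(contact)

print(expanded.columns.tolist())  # ['Name', 'Contact.Email']
```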

Quick Reference

Data Format          Method                   Use Case
Flat dictionaries    pd.DataFrame(data)       Standard records, API data
Nested dictionaries  pd.json_normalize(data)  Complex JSON, nested APIs
Missing keys         Handled automatically    Inconsistent records
Subset of columns    columns=['A', 'B']       Selecting specific fields

  • Use pd.DataFrame(data) for flat lists of dictionaries where keys map directly to columns.
  • Use pd.json_normalize() when your dictionaries contain nested structures that need to be flattened into individual columns.
  • Missing keys are automatically filled with NaN, making Pandas robust for handling real-world data where records do not always have the same fields.