Python Pandas: How to Load a JSON String Into a DataFrame

JSON (JavaScript Object Notation) is one of the most common data interchange formats used in web APIs, configuration files, and data storage. When working with data analysis in Python, you frequently need to convert JSON data - whether from a file, a string, or an API response - into a Pandas DataFrame for efficient manipulation and analysis.

Pandas provides built-in functions like read_json() and json_normalize() that make this conversion straightforward.

In this guide, you will learn how to load JSON strings and files into DataFrames, handle different JSON orientations, and work with nested JSON structures.

Loading a JSON String Directly Into a DataFrame

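The most common scenario is converting a JSON-formatted string into a DataFrame using pd.read_json(). Wrap the string in io.StringIO first: passing a literal JSON string directly has been deprecated since pandas 2.1.

import pandas as pd
from io import StringIO

json_string = '''
[
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": 25, "City": "Chicago"},
    {"Name": "Charlie", "Age": 35, "City": "Houston"}
]
'''

# StringIO makes the string look like a file object to read_json
df = pd.read_json(StringIO(json_string))
print(df)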

Output:

      Name  Age      City
0    Alice   30  New York
1      Bob   25   Chicago
2  Charlie   35   Houston

pd.read_json() turns each object in the array into a row and uses the object keys as column names.

Loading a JSON File Into a DataFrame

To load JSON data from a file, pass the file path directly to pd.read_json():

import pandas as pd

df = pd.read_json("data.json")
print(df)

This reads the entire file, parses the JSON content, and returns a DataFrame. No manual file opening or parsing is required.
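As a self-contained sketch of the file workflow, the following writes a small JSON file to a temporary directory and reads it back. The file name people.json and its contents are made up for illustration; substitute your own path.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data, written in records format.
rows = [
    {"Name": "Alice", "Age": 30},
    {"Name": "Bob", "Age": 25},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "people.json"  # illustrative file name
    path.write_text(json.dumps(rows), encoding="utf-8")

    # read_json accepts a path-like object as well as a plain string path.
    df_file = pd.read_json(path)

print(df_file)
```

The DataFrame is fully loaded before the temporary directory is cleaned up, so it stays usable after the with block ends.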

tip

pd.read_json() also accepts URLs, so you can load JSON data directly from a web API:

df = pd.read_json("https://api.example.com/data.json")

Understanding JSON Orientations

JSON data can be structured in several different ways, and Pandas needs to know the orientation to parse it correctly. The orient parameter in read_json() controls how the JSON structure is interpreted.

Records Orientation (Array of Objects)

This is the most common format - an array where each element is an object representing a row:

[
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": 25, "City": "Chicago"}
]

import pandas as pd
from io import StringIO

json_string = '[{"Name": "Alice", "Age": 30}, {"Name": "Bob", "Age": 25}]'
df = pd.read_json(StringIO(json_string), orient="records")
print(df)

Output:

    Name  Age
0  Alice   30
1    Bob   25

Index Orientation

Each top-level key is a row index, and its value is an object of column-value pairs:

{
    "0": {"Name": "Alice", "Age": 30},
    "1": {"Name": "Bob", "Age": 25}
}

import pandas as pd
from io import StringIO

json_string = '{"0": {"Name": "Alice", "Age": 30}, "1": {"Name": "Bob", "Age": 25}}'
df = pd.read_json(StringIO(json_string), orient="index")
print(df)

Output:

    Name  Age
0  Alice   30
1    Bob   25

Column Orientation

Each top-level key is a column name, and its value is an object mapping row indices to values:

{
    "Name": {"0": "Alice", "1": "Bob"},
    "Age": {"0": 30, "1": 25}
}

import pandas as pd
from io import StringIO

json_string = '{"Name": {"0": "Alice", "1": "Bob"}, "Age": {"0": 30, "1": 25}}'
df = pd.read_json(StringIO(json_string), orient="columns")
print(df)

Output:

    Name  Age
0  Alice   30
1    Bob   25

Values Orientation

A simple 2D array with no column names or indices:

[
    ["Alice", 30, "New York"],
    ["Bob", 25, "Chicago"]
]

import pandas as pd
from io import StringIO

json_string = '[["Alice", 30, "New York"], ["Bob", 25, "Chicago"]]'
df = pd.read_json(StringIO(json_string), orient="values")
df.columns = ["Name", "Age", "City"]  # Assign column names manually
print(df)

Output:

    Name  Age      City
0  Alice   30  New York
1    Bob   25   Chicago

info

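When orient is not specified, Pandas does not inspect the data to choose an orientation. For a DataFrame it falls back to the default, "columns", which also handles an array of objects like the one in the first example. JSON shaped for "index" or "split" should be loaded with an explicit orient; otherwise you get a wrong result or an error.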
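To see why setting orient explicitly matters, this sketch parses the same index-shaped JSON twice: once with the default and once with orient="index". The sample data and the variable names df_default and df_index are illustrative.

```python
from io import StringIO

import pandas as pd

# Index-oriented JSON: the top-level keys "0" and "1" are row labels.
json_string = (
    '{"0": {"Name": "Alice", "Age": 30, "City": "New York"},'
    ' "1": {"Name": "Bob", "Age": 25, "City": "Chicago"}}'
)

# Default orient ("columns"): the top-level keys are treated as columns,
# so the people end up as columns and the fields as rows.
df_default = pd.read_json(StringIO(json_string))

# Explicit orient="index": the top-level keys are treated as row labels.
df_index = pd.read_json(StringIO(json_string), orient="index")

print(df_default.shape)  # rows x columns with the default
print(df_index.shape)    # rows x columns with orient="index"
```

Both calls succeed without raising, which is exactly why a mismatched orientation is easy to miss: the only sign is a transposed table.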

Quick Reference: JSON Orientations

Orientation	Structure	Loads Correctly Without orient?
records	`[{col: val, ...}, ...]`	Yes
index	`{index: {col: val, ...}, ...}`	No - must specify
columns	`{col: {index: val, ...}, ...}`	Yes (default)
values	`[[val, val, ...], ...]`	Yes, but columns are numbered
split	`{"index": [...], "columns": [...], "data": [...]}`	No - must specify
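The split orientation appears in the table but has no example of its own, so here is a minimal sketch. The sample values are made up; the three keys (columns, index, data) are the ones the split layout expects.

```python
from io import StringIO

import pandas as pd

# Split-oriented JSON: column names, row labels, and cell values are stored separately.
split_json = '''
{
    "columns": ["Name", "Age"],
    "index": [0, 1],
    "data": [["Alice", 30], ["Bob", 25]]
}
'''

df_split = pd.read_json(StringIO(split_json), orient="split")
print(df_split)
```

This is the same layout that DataFrame.to_json(orient="split") writes, so it is a convenient choice when a DataFrame needs to survive a round trip with its index and column order intact.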

Loading JSON From a Python Dictionary

If your data is already a Python dictionary (not a JSON string), pass it to pd.DataFrame() directly. There is no need to serialize it to JSON first:

import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [30, 25, 35],
    "City": ["New York", "Chicago", "Houston"]
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age      City
0    Alice   30  New York
1      Bob   25   Chicago
2  Charlie   35   Houston

For a list of dictionaries (records format):

import pandas as pd

records = [
    {"Name": "Alice", "Age": 30},
    {"Name": "Bob", "Age": 25},
    {"Name": "Charlie", "Age": 35}
]

df = pd.DataFrame(records)
print(df)

Output:

      Name  Age
0    Alice   30
1      Bob   25
2  Charlie   35

Handling Nested JSON with `json_normalize()`

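Real-world JSON data is often nested - objects contain other objects or arrays. pd.read_json() does not flatten nested structures automatically. Use pd.json_normalize() instead. Note that it takes already-parsed Python objects (a dict or a list of dicts), not a raw JSON string: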

import pandas as pd

data = [
    {
        "Name": "Alice",
        "Age": 30,
        "Address": {
            "City": "New York",
            "State": "NY"
        }
    },
    {
        "Name": "Bob",
        "Age": 25,
        "Address": {
            "City": "Chicago",
            "State": "IL"
        }
    }
]

df = pd.json_normalize(data)
print(df)

Output:

    Name  Age Address.City Address.State
0  Alice   30     New York            NY
1    Bob   25      Chicago            IL

The nested Address object is automatically flattened into Address.City and Address.State columns.
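When the nested data arrives as a JSON string rather than Python objects, one way to handle it is to decode it with json.loads() first. The sketch below also uses the sep parameter to change the separator in the flattened column names; the sample data and the name df_flat are illustrative.

```python
import json

import pandas as pd

json_string = '''
[
    {"Name": "Alice", "Address": {"City": "New York", "State": "NY"}},
    {"Name": "Bob", "Address": {"City": "Chicago", "State": "IL"}}
]
'''

# json_normalize works on parsed Python objects, so decode the string first.
records = json.loads(json_string)

# sep controls the separator used when joining nested keys into column names.
df_flat = pd.json_normalize(records, sep="_")
print(df_flat.columns.tolist())
```

Underscore-separated names such as Address_City are easier to use with attribute access and in query strings than the default dotted names.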

Common Mistakes and How to Fix Them

Mistake 1: Using the Wrong Variable Name

import pandas

# ❌ NameError: 'df' is not defined. read_json() belongs to the pandas module, not to a DataFrame variable
data = df.read_json("data.json")

Fix: Use the correct module reference:

import pandas as pd

# ✅ Correct
data = pd.read_json("data.json")

Mistake 2: Passing a Dict Instead of a JSON String

import pandas as pd

# ❌ This is a Python dict, not a JSON string
data = {"Name": ["Alice"], "Age": [30]}
df = pd.read_json(data)
# ValueError: Invalid file path or buffer object type: <class 'dict'>

Fix: Convert the dict to a JSON string first, or use pd.DataFrame():

import pandas as pd
import json
from io import StringIO

# ✅ Option 1: Convert to a JSON string and wrap it in StringIO
data = {"Name": ["Alice"], "Age": [30]}
df = pd.read_json(StringIO(json.dumps(data)))

# ✅ Option 2: Use DataFrame directly
df = pd.DataFrame(data)

Mistake 3: Wrong Orientation for the Data

import pandas as pd
from io import StringIO

# This is records-oriented JSON
json_string = '[{"Name": "Alice"}, {"Name": "Bob"}]'

# ❌ 'index' expects an object keyed by row label, not an array of records
df = pd.read_json(StringIO(json_string), orient="index")

Fix: Match the orient parameter to your actual JSON structure:

# ✅ Correct orientation
df = pd.read_json(StringIO(json_string), orient="records")

warning

When in doubt about the orientation, load your JSON string with Python's json module first to inspect its structure:

import json

data = json.loads(json_string)
print(type(data))  # list → likely 'records' or 'values'
                    # dict → likely 'columns', 'index', or 'split'
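The inspection step can be wrapped in a small function. The guess_orient helper below is hypothetical, not part of pandas, and it is only a heuristic: columns-shaped and index-shaped JSON look the same structurally, so it cannot tell them apart.

```python
import json


def guess_orient(json_string: str) -> str:
    """Hypothetical helper: suggest an orient value from the parsed structure."""
    data = json.loads(json_string)
    if isinstance(data, list):
        # A non-empty list of objects looks like records;
        # anything else (including an empty list) is reported as values.
        if data and all(isinstance(item, dict) for item in data):
            return "records"
        return "values"
    if isinstance(data, dict):
        # The split layout has exactly these three top-level keys.
        if set(data) == {"index", "columns", "data"}:
            return "split"
        # columns and index are indistinguishable by shape; the caller must decide.
        return "columns or index"
    raise ValueError("Top-level JSON value must be an array or an object")


print(guess_orient('[{"Name": "Alice"}, {"Name": "Bob"}]'))
print(guess_orient('[["Alice", 30], ["Bob", 25]]'))
print(guess_orient('{"index": [0], "columns": ["Name"], "data": [["Alice"]]}'))
```

Treat the result as a starting point for checking the data by eye, not as a value to pass blindly to read_json().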

Using `json` Module With `pd.DataFrame()`

For maximum control, parse the JSON manually with Python's built-in json module and then create the DataFrame:

import json
import pandas as pd

json_string = '''
{
    "employees": [
        {"name": "Alice", "department": "Engineering"},
        {"name": "Bob", "department": "Marketing"},
        {"name": "Charlie", "department": "Sales"}
    ]
}
'''

parsed = json.loads(json_string)
df = pd.DataFrame(parsed["employees"])
print(df)

Output:

      name   department
0    Alice  Engineering
1      Bob    Marketing
2  Charlie        Sales

This approach is useful when the data you need is nested inside a specific key of the JSON object.
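If the surrounding object also carries fields you want on every row, pd.json_normalize() with record_path and meta is one alternative to indexing the key by hand. In this sketch the "company" field is an assumption added for illustration; it is not part of the example above.

```python
import json

import pandas as pd

json_string = '''
{
    "company": "Example Corp",
    "employees": [
        {"name": "Alice", "department": "Engineering"},
        {"name": "Bob", "department": "Marketing"}
    ]
}
'''

parsed = json.loads(json_string)

# record_path selects the nested list to turn into rows;
# meta copies the listed top-level fields onto each of those rows.
df_emp = pd.json_normalize(parsed, record_path="employees", meta=["company"])
print(df_emp)
```

Indexing with parsed["employees"] remains the simpler choice when only the nested list matters.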

Conclusion

Loading JSON data into a Pandas DataFrame is straightforward with the right tools.

Use pd.read_json() for JSON strings and files with standard orientations, pd.json_normalize() for nested JSON structures that need flattening, and pd.DataFrame() with the json module for maximum control over parsing.

Understanding JSON orientations (records, index, columns, values, and split) is key to ensuring your data loads correctly.

Always match the orient parameter to your JSON structure, and inspect unfamiliar JSON data before loading to avoid silent misinterpretation.
