How to Flatten JSON Objects in Python

JSON (JavaScript Object Notation) is one of the most common data formats for exchanging information between web applications, APIs, and servers. Once parsed in Python, JSON data is represented as dictionaries and lists, which can become deeply nested and difficult to work with, especially when loading data into flat structures like database tables, CSV files, or pandas DataFrames.

Flattening JSON means converting a nested JSON structure into a single-level dictionary with only key-value pairs, where nested keys are joined into compound keys using a separator (typically an underscore).

Example:

# Nested JSON
{"user": {"name": "Rachel", "address": {"city": "NYC", "zip": "10001"}}}

# Flattened JSON
{"user_name": "Rachel", "user_address_city": "NYC", "user_address_zip": "10001"}

This guide covers multiple approaches to flatten JSON objects in Python, from a custom recursive function to using third-party libraries.

Why Flatten JSON?

Flattening JSON is useful in several scenarios:

  • Loading into tabular formats - databases, spreadsheets, and DataFrames require flat key-value structures.
  • Simplifying data access - instead of navigating multiple nesting levels like data["user"]["address"]["city"], you can access data["user_address_city"] directly.
  • Search and filtering - flat structures are easier to query and index.
  • Logging and monitoring - flat key-value pairs are more readable in log outputs.

Using a Recursive Function (No Dependencies)

The most flexible approach is writing a recursive function that traverses the nested structure and builds a flat dictionary. This handles dictionaries, lists, and arbitrary levels of nesting (up to Python's recursion limit).

def flatten_json(nested_json, separator="_"):
    """Flatten a nested JSON object into a single-level dictionary."""
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat


# Example usage
data = {
    "user": {
        "Rachel": {
            "UserID": 1717171717,
            "Email": "rachel@example.com",
            "friends": ["John", "Jeremy", "Emily"]
        }
    }
}

result = flatten_json(data)
for key, value in result.items():
    print(f"{key}: {value}")

Output:

user_Rachel_UserID: 1717171717
user_Rachel_Email: rachel@example.com
user_Rachel_friends_0: John
user_Rachel_friends_1: Jeremy
user_Rachel_friends_2: Emily

How It Works

  1. The function checks the type of the current object.
  2. If it's a dict, it recurses into each key-value pair, appending the key to the parent path.
  3. If it's a list, it recurses into each item, using the index as the key.
  4. If it's a primitive value (string, number, boolean, or None, which is how JSON null appears in Python), it stores it in the flat dictionary with the accumulated key path.
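
The four steps above can be traced on a small input. The following is a sketch that repeats the flatten_json function so it runs on its own; the sample records are made up for illustration. It shows two details the steps imply: a top-level list produces keys that start with the list index, and None and booleans are stored as ordinary leaf values.

```python
def flatten_json(nested_json, separator="_"):
    """Same function as above, repeated so this snippet is self-contained."""
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat


# Hypothetical sample: a top-level list holding dicts, a nested list, and None
records = [
    {"id": 1, "active": True, "tags": ["a", "b"]},
    {"id": 2, "active": False, "manager": None},
]

traced = flatten_json(records)
for key, value in traced.items():
    print(f"{key}: {value}")
```

Each printed key is the path taken through steps 2 and 3, for example 0_tags_1 for the second tag of the first record.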

Customizing the Separator

You can change the separator to match your needs:

def flatten_json(nested_json, separator="_"):
    ...  # as defined in the example above

data = {"server": {"config": {"port": 8080, "host": "localhost"}}}

print(flatten_json(data, separator="."))
print(flatten_json(data, separator="/"))

Output:

{'server.config.port': 8080, 'server.config.host': 'localhost'}
{'server/config/port': 8080, 'server/config/host': 'localhost'}

Tip: Using a dot separator (.) produces keys that resemble JavaScript-style property access paths, which is a common convention in logging frameworks and configuration systems.
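
One thing to watch when choosing a separator: if a key already contains the separator, two different paths can flatten to the same compound key, and the later value silently overwrites the earlier one. The sketch below shows one possible way to detect that. The name flatten_json_strict is hypothetical and not part of any library; it is the same traversal as above with a collision check added.

```python
def flatten_json_strict(nested_json, separator="_"):
    """Flatten like flatten_json, but raise if two paths yield the same key."""
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            # A key seen twice means two distinct paths collapsed into one
            if parent_key in flat:
                raise ValueError(f"Key collision on {parent_key!r}")
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat


# "user_name" appears both as a literal key and as the path user -> name
ambiguous = {"user_name": "Rachel", "user": {"name": "John"}}

try:
    flatten_json_strict(ambiguous)
    collided = False
except ValueError as error:
    collided = True
    print(error)

# A separator that does not occur in any key avoids the clash
safe = flatten_json_strict(ambiguous, separator=".")
print(safe)
```

Picking a separator that cannot appear in your keys is usually simpler than detecting collisions after the fact.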

Using the flatten_json Library

The flatten_json package provides a ready-made solution with additional features like custom separators and the ability to unflatten data back to its nested form.

Installation

pip install flatten_json

Basic Usage

from flatten_json import flatten

data = {
    "user": {
        "Rachel": {
            "UserID": 1717171717,
            "Email": "rachel@example.com",
            "friends": ["John", "Jeremy", "Emily"]
        }
    }
}

flat = flatten(data)

for key, value in flat.items():
    print(f"{key}: {value}")

Output:

user_Rachel_UserID: 1717171717
user_Rachel_Email: rachel@example.com
user_Rachel_friends_0: John
user_Rachel_friends_1: Jeremy
user_Rachel_friends_2: Emily

Unflattening Back to Nested JSON

One advantage of the library is the ability to reverse the operation:

from flatten_json import flatten, unflatten_list

data = {
    "user": {
        "name": "Rachel",
        "scores": [95, 87, 92]
    }
}

flat = flatten(data)
print("Flattened:", flat)

nested = unflatten_list(flat)
print("Unflattened:", nested)

Output:

Flattened: {'user_name': 'Rachel', 'user_scores_0': 95, 'user_scores_1': 87, 'user_scores_2': 92}
Unflattened: {'user': {'name': 'Rachel', 'scores': [95, 87, 92]}}
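
If you would rather not add a dependency, the reverse operation can also be sketched by hand. The helper below, unflatten_json, is a hypothetical name and is not the library's implementation. It assumes that no original key contains the separator, and that a dict whose keys are exactly "0" through "n-1" was originally a list; data that breaks either assumption will not round-trip.

```python
def unflatten_json(flat, separator="_"):
    """Rebuild nested dicts and lists from a flat dict (illustrative sketch)."""
    nested = {}

    # Step 1: split each compound key and rebuild the nested dicts
    for compound_key, value in flat.items():
        parts = compound_key.split(separator)
        current = nested
        for part in parts[:-1]:
            current = current.setdefault(part, {})
        current[parts[-1]] = value

    # Step 2: convert dicts keyed "0", "1", ..., "n-1" back into lists
    def _restore_lists(obj):
        if not isinstance(obj, dict):
            return obj
        restored = {key: _restore_lists(value) for key, value in obj.items()}
        indices = [str(i) for i in range(len(restored))]
        if restored and set(restored) == set(indices):
            return [restored[i] for i in indices]
        return restored

    return _restore_lists(nested)


flat = {"user_name": "Rachel", "user_scores_0": 95, "user_scores_1": 87, "user_scores_2": 92}
nested = unflatten_json(flat)
print(nested)
```

For the flattened dictionary from the example above, this rebuilds the same nested structure that unflatten_list returned.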

Using pandas.json_normalize() for DataFrames

If your end goal is to load JSON data into a pandas DataFrame, pd.json_normalize() flattens nested JSON directly into a tabular format:

import pandas as pd

data = [
    {
        "name": "Rachel",
        "address": {"city": "NYC", "zip": "10001"},
        "scores": [95, 87]
    },
    {
        "name": "John",
        "address": {"city": "LA", "zip": "90001"},
        "scores": [88, 91]
    }
]

df = pd.json_normalize(data)
print(df)

Output:

     name    scores address.city address.zip
0  Rachel  [95, 87]          NYC       10001
1    John  [88, 91]           LA       90001

Note: pd.json_normalize() uses a dot (.) as the default separator for nested keys. It flattens nested dictionaries but leaves lists as-is in the DataFrame cells. For full flattening including lists, use the recursive approach or the flatten_json library first.
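
The "flatten first" route can be sketched without pandas. The example below flattens each record completely, list items included, and writes the rows with the standard library's csv module to an in-memory buffer. The same list of flat dictionaries could instead be passed to pd.DataFrame(rows). The dot separator is chosen here only to mirror the column names json_normalize produces.

```python
import csv
import io


def flatten_json(nested_json, separator="_"):
    """Recursive flattener from earlier in this guide, repeated for completeness."""
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat


data = [
    {"name": "Rachel", "address": {"city": "NYC", "zip": "10001"}, "scores": [95, 87]},
    {"name": "John", "address": {"city": "LA", "zip": "90001"}, "scores": [88, 91]},
]

# Flatten every record, including the list items
rows = [flatten_json(record, separator=".") for record in data]

# Collect column names in first-seen order, since records may differ in shape
fieldnames = []
for row in rows:
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Each list item now gets its own column (scores.0, scores.1), and restval="" leaves a cell empty when a record lacks that key.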

Handling Edge Cases

Empty Objects and Lists

Your flattening function should handle empty dicts and lists gracefully:

def flatten_json(nested_json, separator="_"):
    """Flatten a nested JSON object into a single-level dictionary."""
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat

data = {
    "user": {
        "name": "Rachel",
        "tags": [],
        "metadata": {}
    }
}

result = flatten_json(data)
print(result)

Output:

{'user_name': 'Rachel'}

Warning: With the recursive function shown above, empty lists and empty dicts produce no keys in the output; they are silently dropped. If you need to preserve them, add explicit handling:

def flatten_json_preserve_empty(nested_json, separator="_"):
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            if not obj:  # Empty dict
                flat[parent_key] = {}
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            if not obj:  # Empty list
                flat[parent_key] = []
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat


data = {
    "user": {
        "name": "Rachel",
        "tags": [],
        "metadata": {}
    }
}

result = flatten_json_preserve_empty(data)
print(result)

Output:

{'user_name': 'Rachel', 'user_tags': [], 'user_metadata': {}}
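
Very Deep Nesting

Another edge case is extremely deep nesting. The recursive function makes one Python call per nesting level, so input nested deeper than the interpreter's recursion limit (1000 frames by default in CPython) raises RecursionError. The following is a minimal sketch of an iterative alternative that keeps pending work on an explicit stack. The name flatten_json_iterative is hypothetical, and like the first version above it drops empty dicts and lists.

```python
def flatten_json_iterative(nested_json, separator="_"):
    """Flatten without recursion by keeping pending work on an explicit stack."""
    flat = {}
    stack = [("", nested_json)]

    while stack:
        parent_key, obj = stack.pop()
        if isinstance(obj, dict):
            children = list(obj.items())
        elif isinstance(obj, list):
            children = list(enumerate(obj))
        else:
            flat[parent_key] = obj
            continue
        # Push in reverse so children are popped in their original order
        for key, value in reversed(children):
            new_key = f"{parent_key}{separator}{key}" if parent_key else str(key)
            stack.append((new_key, value))

    return flat


# Build a structure 5,000 levels deep, beyond the default recursion limit
deep = "leaf"
for _ in range(5000):
    deep = {"k": deep}

deep_result = flatten_json_iterative(deep)
print(len(deep_result))

# On ordinary input the keys come out in the same order as the recursive version
small_result = flatten_json_iterative({"a": {"b": [1, 2]}, "c": 3})
print(small_result)
```

The trade-off is readability: the recursive version is easier to follow, so the iterative form is mainly worth it when the input depth is not under your control.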

Flattening a JSON File

To flatten JSON data read from a file:

import json

def flatten_json(nested_json, separator="_"):
    """Flatten a nested JSON object into a single-level dictionary."""
    flat = {}

    def _flatten(obj, parent_key=""):
        if isinstance(obj, dict):
            for key, value in obj.items():
                new_key = f"{parent_key}{separator}{key}" if parent_key else key
                _flatten(value, new_key)
        elif isinstance(obj, list):
            for index, item in enumerate(obj):
                new_key = f"{parent_key}{separator}{index}" if parent_key else str(index)
                _flatten(item, new_key)
        else:
            flat[parent_key] = obj

    _flatten(nested_json)
    return flat

# Read and flatten a JSON file (JSON text is normally UTF-8 encoded)
with open("data.json", "r", encoding="utf-8") as f:
    data = json.load(f)

flat_data = flatten_json(data)
for key, value in flat_data.items():
    print(f"{key}: {value}")

Comparison of Approaches

Method                 Dependencies   Unflatten Support   Best For
Recursive function     None           ❌ (manual)          Full control, no dependencies
flatten_json library   flatten_json   ✅                   Quick solution with unflatten support
pd.json_normalize()    pandas         ❌                   Loading JSON into DataFrames

Conclusion

Flattening JSON is a fundamental operation when working with nested data in Python.

  • For maximum control and zero dependencies, the recursive approach handles any level of nesting and allows custom separators.
  • The flatten_json library is ideal when you also need to unflatten data back to its original structure.
  • If your goal is to analyze JSON data in pandas, pd.json_normalize() provides the most direct path to a tabular format.

Choose the approach that best fits your workflow and data pipeline requirements.