How to Merge Multiple JSON Files Using Python
When working with data pipelines, APIs, or configuration systems, you'll often encounter situations where related data is spread across multiple JSON files. For example, you might have daily log exports, per-user configuration files, or paginated API responses that need to be combined into a single unified file for analysis or processing.
In this guide, you'll learn several approaches to merging JSON files in Python, from the built-in json module to automated directory scanning, along with best practices for handling different JSON structures.
Sample JSON Files
Throughout this guide, we'll use three sample JSON files, each containing a single object:

users.json:
{"name": "Alice", "age": 30, "city": "New York"}

orders.json:
{"name": "Bob", "age": 25, "city": "Chicago"}

products.json:
{"name": "Charlie", "age": 35, "city": "Boston"}
Method 1: Using the json Module with Explicit File Paths
The most straightforward approach uses Python's built-in json module to read each file and append its data to a list:
import json

def merge_json_files(file_paths, output_file):
    merged_data = []
    for path in file_paths:
        with open(path, 'r') as file:
            data = json.load(file)
            merged_data.append(data)
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    print(f"Merged {len(file_paths)} files into '{output_file}'")
    return merged_data

# Specify files explicitly
file_paths = ["users.json", "orders.json", "products.json"]
result = merge_json_files(file_paths, "merged.json")
print(result)
Output (merged.json, abridged):

[
  {"name": "Alice", "age": 30, "city": "New York"},
  {"name": "Bob", "age": 25, "city": "Chicago"},
  {"name": "Charlie", "age": 35, "city": "Boston"}
]
Always use indent=2 (or indent=4) in json.dump() when the output file needs to be human-readable. Without it, the entire JSON is written on a single line, making it difficult to inspect.
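To see the difference, compare the same data serialized with and without indent:

```python
import json

data = [{"name": "Alice", "age": 30}]

compact = json.dumps(data)           # everything on one long line
pretty = json.dumps(data, indent=2)  # one key per line, nested indentation

print(compact)
print(pretty)
```

Both strings parse back to identical data; only the whitespace differs.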
Method 2: Using List Comprehension for Concise Code
If you prefer a more compact style, a list comprehension combined with a small helper function reads and merges the files in fewer lines, while still closing each file properly:

import json

def load_json(path):
    with open(path, 'r') as f:
        return json.load(f)

file_paths = ["users.json", "orders.json", "products.json"]
merged_data = [load_json(path) for path in file_paths]

with open("merged.json", 'w') as outfile:
    json.dump(merged_data, outfile, indent=2)

print(merged_data)
Output:
[
{'name': 'Alice', 'age': 30, 'city': 'New York'},
{'name': 'Bob', 'age': 25, 'city': 'Chicago'},
{'name': 'Charlie', 'age': 35, 'city': 'Boston'}
]
You might see examples that use json.load(open(path, 'r')) inside a list comprehension without a with statement. This is bad practice because it doesn't guarantee the file handle is properly closed:
# ❌ Bad: file handles may not be closed properly
merged_data = [json.load(open(path, 'r')) for path in file_paths]
Always use with open(...) to ensure files are closed correctly, even if an error occurs during reading.
Method 3: Scanning a Directory with os
When you have all your JSON files in a single directory and don't want to list them manually, use os.listdir() to discover them automatically:
import json
import os

def merge_json_from_directory(directory_path, output_file):
    merged_data = []
    for filename in sorted(os.listdir(directory_path)):
        if filename.endswith('.json'):
            filepath = os.path.join(directory_path, filename)
            with open(filepath, 'r') as file:
                data = json.load(file)
                merged_data.append(data)
            print(f"  Read: {filename}")
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    print(f"\nMerged {len(merged_data)} files into '{output_file}'")
    return merged_data

result = merge_json_from_directory("./data", "merged.json")
Output:
Read: orders.json
Read: products.json
Read: users.json
Merged 3 files into 'merged.json'
Using sorted() on os.listdir() ensures files are processed in alphabetical order. Without sorting, the order depends on the filesystem and may vary across operating systems.
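If you prefer pathlib over os, the same directory scan can be sketched like this (equivalent behavior, standard library only):

```python
import json
from pathlib import Path

def merge_json_from_directory(directory_path, output_file):
    merged_data = []
    # Path.glob('*.json') replaces the listdir() + endswith() filter;
    # sorted() again fixes the processing order
    for filepath in sorted(Path(directory_path).glob("*.json")):
        merged_data.append(json.loads(filepath.read_text()))
    Path(output_file).write_text(json.dumps(merged_data, indent=2))
    return merged_data
```

Note that Path.glob returns Path objects, which sort by their string form, so the ordering matches the os.listdir() version.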
Method 4: Using glob for Pattern Matching
The glob module offers more flexible file discovery with wildcard patterns. This is especially useful when your JSON files follow a naming convention:
import json
import glob

def merge_json_with_glob(pattern, output_file):
    merged_data = []
    file_paths = sorted(glob.glob(pattern))
    if not file_paths:
        print(f"No files found matching pattern: {pattern}")
        return []
    for path in file_paths:
        with open(path, 'r') as file:
            data = json.load(file)
            merged_data.append(data)
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    print(f"Merged {len(file_paths)} files into '{output_file}'")
    return merged_data

# Merge all JSON files in the "data" directory
result = merge_json_with_glob("data/*.json", "merged.json")

# Or match a specific pattern
# result = merge_json_with_glob("data/report_2025_*.json", "merged_reports.json")
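glob can also descend into subdirectories with the ** wildcard, provided you pass recursive=True (the data/ layout below is hypothetical):

```python
import glob

# Matches data/a.json as well as data/2025/01/a.json, at any depth
paths = sorted(glob.glob("data/**/*.json", recursive=True))
print(paths)
```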
Method 5: Merging into a Dictionary Instead of a List
The previous methods merge JSON objects into a list (array). Sometimes you need to merge them into a single dictionary where all key-value pairs are combined:
import json
import glob

def merge_json_as_dict(pattern, output_file):
    merged_data = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, 'r') as file:
            data = json.load(file)
            merged_data.update(data)
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    return merged_data

result = merge_json_as_dict("data/*.json", "merged_dict.json")
print(result)
Output (merged_dict.json):

{
  "name": "Alice",
  "age": 30,
  "city": "New York"
}

When using dict.update(), duplicate keys are overwritten by the last file processed. In the example above, since all three files share the "name", "age", and "city" keys, only the values from the file processed last (users.json, last in alphabetical order) survive. Use the list-based approach if you need to keep all records.
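If you want a single dictionary but cannot afford to lose colliding keys, one workaround (a sketch, not one of the methods above) is to nest each file's data under its own filename:

```python
import json
import glob
import os

def merge_json_namespaced(pattern, output_file):
    merged_data = {}
    for path in sorted(glob.glob(pattern)):
        # The filename without its extension becomes a top-level key,
        # so identical keys in different files can no longer collide
        key = os.path.splitext(os.path.basename(path))[0]
        with open(path, 'r') as file:
            merged_data[key] = json.load(file)
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    return merged_data
```

With the sample files, this produces top-level "orders", "products", and "users" keys, each holding that file's full object.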
Method 6: Using Pandas for Structured Data
When your JSON files contain records with the same schema (same keys), Pandas can merge them into a structured DataFrame, which is ideal for further analysis:
import pandas as pd
import glob

def merge_json_with_pandas(pattern, output_file):
    file_paths = sorted(glob.glob(pattern))
    dataframes = []
    for path in file_paths:
        # typ='series' reads a single JSON object as a pandas Series
        df = pd.read_json(path, typ='series')
        dataframes.append(df)
    merged_df = pd.DataFrame(dataframes).reset_index(drop=True)
    # Save as a JSON array of records
    merged_df.to_json(output_file, orient='records', indent=2)
    print(f"Merged {len(file_paths)} files into '{output_file}'")
    return merged_df

result = merge_json_with_pandas("data/*.json", "merged_pandas.json")
print(result)
Output (rows follow the sorted filename order: orders.json, products.json, users.json):

Merged 3 files into 'merged_pandas.json'
      name  age      city
0      Bob   25   Chicago
1  Charlie   35    Boston
2    Alice   30  New York

merged_pandas.json (abridged):

[
  {"name": "Bob", "age": 25, "city": "Chicago"},
  {"name": "Charlie", "age": 35, "city": "Boston"},
  {"name": "Alice", "age": 30, "city": "New York"}
]
Use orient='records' with to_json() to produce a clean JSON array where each object represents one row. Other orientations like 'columns' or 'index' produce structures that are harder to work with downstream.
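For nested objects, pandas.json_normalize flattens inner keys into dotted column names before merging (a small sketch; the nested shape shown is hypothetical):

```python
import pandas as pd

record = {"name": "Alice", "address": {"city": "New York", "zip": "10001"}}

# Nested keys become 'address.city' and 'address.zip' columns
df = pd.json_normalize(record)
print(df.columns.tolist())  # ['name', 'address.city', 'address.zip']
```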
Handling Edge Cases
Files Containing JSON Arrays
If your JSON files already contain arrays (lists of objects) rather than single objects, you need to extend rather than append:
import json

def merge_json_arrays(file_paths, output_file):
    merged_data = []
    for path in file_paths:
        with open(path, 'r') as file:
            data = json.load(file)
        if isinstance(data, list):
            merged_data.extend(data)   # Flatten arrays
        else:
            merged_data.append(data)   # Single objects
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    return merged_data
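To illustrate the branch, here is the same extend-vs-append logic applied to two hypothetical file contents held as strings:

```python
import json

merged = []
# One "file" holds an array of objects, the other a single object
for raw in ['[{"id": 1}, {"id": 2}]', '{"id": 3}']:
    data = json.loads(raw)
    if isinstance(data, list):
        merged.extend(data)   # splice the array's elements in
    else:
        merged.append(data)   # add the lone object

print(merged)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```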
Handling Invalid JSON Files
In real-world scenarios, some files might contain malformed JSON. Add error handling to skip problematic files gracefully:
import json
import glob

def safe_merge_json(pattern, output_file):
    merged_data = []
    errors = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, 'r') as file:
                data = json.load(file)
            merged_data.append(data)
        except json.JSONDecodeError as e:
            errors.append(path)
            print(f"Warning: Skipping '{path}': invalid JSON ({e})")
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    print(f"\nMerged: {len(merged_data)} files | Skipped: {len(errors)} files")
    return merged_data

result = safe_merge_json("data/*.json", "merged_safe.json")
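Log exports often arrive as JSON Lines (one object per line) rather than standard JSON, and json.load() would reject such files outright. A sketch for merging them, assuming a .jsonl extension:

```python
import json
import glob

def merge_jsonl(pattern, output_file):
    merged_data = []
    for path in sorted(glob.glob(pattern)):
        with open(path, 'r') as file:
            for line in file:
                line = line.strip()
                if line:  # tolerate blank lines between records
                    merged_data.append(json.loads(line))
    with open(output_file, 'w') as outfile:
        json.dump(merged_data, outfile, indent=2)
    return merged_data
```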
Choosing the Right Approach
| Method | Best For |
|---|---|
| json module with explicit paths | Small number of known files |
| os.listdir() | All JSON files in a single directory |
| glob.glob() | Pattern-based file discovery |
| Dictionary merge (update()) | Combining key-value pairs (last value wins on duplicates) |
| Pandas | Structured/tabular JSON data for analysis |
Summary
Merging multiple JSON files in Python is straightforward with the built-in json module. Use explicit file lists for small, known sets of files, or glob/os for automatic discovery in directories.
Choose between list-based merging (preserves all records) and dictionary-based merging (combines key-value pairs) depending on your data structure.
Always handle potential issues like malformed JSON and duplicate keys, and use indent in json.dump() to keep the output readable.