Python Pandas: How to Resolve "ValueError: Trailing data" Error

When using the pandas.read_json() function, you might encounter the ValueError: Trailing data. This error occurs when the JSON parser successfully reads a complete JSON object or array but then finds additional, unexpected data before the end of the file. A standard JSON file may contain only a single root element (typically one object {...} or one array [...]).

The most common cause of this error is trying to read a JSON Lines file (.jsonl, .ndjson), where each line is a separate, valid JSON object. This guide will explain the difference and show you the simple fix by using the lines=True parameter.

Understanding the Error: Standard JSON vs. JSON Lines

The key to solving this error is to understand the format of your file:

Standard JSON: A single file must contain exactly one JSON element. This can be a single object or an array that contains multiple objects. The parser expects to reach the end of the file after this single element is closed.
```
[
  {"name": "Tom"},
  {"name": "John"}
]
```
JSON Lines (.jsonl): A text format where each line is a separate, valid JSON object. This format is common for streaming data and logs.
```
{"name": "Tom"}
{"name": "John"}
```

The ValueError: Trailing data occurs when you use the default read_json() on a JSON Lines file. The parser reads the first object ({"name": "Tom"}), considers its job done, and then unexpectedly finds more data (the second object) on the next line.
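The same behavior can be observed with Python's built-in json module, which reports the equivalent problem as "Extra data". The snippet below is only an illustrative sketch of the mechanism, not pandas' internal code: the variable names are made up for the example, and the line-by-line loop at the end approximates the effect of lines=True rather than reproducing how pandas implements it.

```python
import json

# Two JSON objects separated by a newline (JSON Lines style).
jsonl_text = '{"name": "Tom"}\n{"name": "John"}'

# A strict parser accepts exactly one root value and rejects the rest.
try:
    json.loads(jsonl_text)
    strict_error = None
except json.JSONDecodeError as exc:  # JSONDecodeError subclasses ValueError
    strict_error = exc

# raw_decode() parses one value and returns the index where it stopped,
# which makes the "trailing" part visible.
first_obj, end = json.JSONDecoder().raw_decode(jsonl_text)
leftover = jsonl_text[end:].strip()

# Parsing each non-empty line separately handles the JSON Lines layout.
records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

print(strict_error)
print(first_obj, repr(leftover))
print(records)
```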

Reproducing the `ValueError`

Let's assume you have a file named data.json in the JSON Lines format.

data.json file content:

{ "name": "Tom", "about": "29 years old. A programmer." }
{ "name": "John", "about": "32 years old. A designer." }
{ "name": "Susan", "about": "25 years old. A writer." }

Example of code causing the error:

import pandas as pd

# Incorrect: Default read_json() expects a standard JSON file.
data = pd.read_json('data.json')
print(data)

Output:

Traceback (most recent call last):
  File "main.py", line 9, in <module>
    data = pd.read_json('data.json')
  File "/usr/lib/python3.8/site-packages/pandas/util/_decorators.py", line 199, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/pandas/util/_decorators.py", line 296, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/pandas/io/json/_json.py", line 618, in read_json
    result = json_reader.read()
  File "/usr/lib/python3.8/site-packages/pandas/io/json/_json.py", line 755, in read
    obj = self._get_object_parser(self.data)
  File "/usr/lib/python3.8/site-packages/pandas/io/json/_json.py", line 777, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "/usr/lib/python3.8/site-packages/pandas/io/json/_json.py", line 886, in parse
    self._parse_no_numpy()
  File "/usr/lib/python3.8/site-packages/pandas/io/json/_json.py", line 1119, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
ValueError: Trailing data

Solution 1: Use `lines=True` for JSON Lines Format (Recommended)

The pandas.read_json() function has a specific parameter, lines=True, designed to handle the JSON Lines format correctly. This tells pandas to treat each line in the file as an individual JSON object.

Solution:

import pandas as pd

# ✅ Correct: Use the lines=True parameter to parse each line as a JSON object.
data = pd.read_json('data.json', lines=True)

print(data)

Output:

    name                        about
0    Tom  29 years old. A programmer.
1   John    32 years old. A designer.
2  Susan      25 years old. A writer.

This is the idiomatic way to solve the problem, and it requires no changes to the source file.
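If you do not know in advance whether a file is standard JSON or JSON Lines, one possible approach is to try the default parser first and retry with lines=True when it fails. The helper name read_json_auto below is hypothetical, and the sketch assumes that any ValueError from the first attempt means the file is JSON Lines; if that guess is wrong, the second call raises its own error.

```python
import os
import tempfile

import pandas as pd


def read_json_auto(path):
    """Hypothetical helper: try standard JSON, then fall back to JSON Lines."""
    try:
        return pd.read_json(path)
    except ValueError:
        # Assumption: the failure is caused by the JSON Lines layout.
        # If not, this retry raises and the caller sees that error.
        return pd.read_json(path, lines=True)


# Demo with a temporary JSON Lines file (shortened sample data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "data.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write('{"name": "Tom"}\n{"name": "John"}\n')
    demo_df = read_json_auto(demo_path)

print(demo_df)
```

Catching every ValueError keeps the sketch short. In production code you may prefer to inspect the error message before retrying so that unrelated parsing problems are not masked.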

Solution 2: Manually Fix the JSON File Structure

If you have control over the source file and it is small enough to edit, you can convert it into a standard JSON format by wrapping all the objects in a single list and separating them with commas.

Solution: modified data.json file content:

[
  { "name": "Tom", "about": "29 years old. A programmer." },
  { "name": "John", "about": "32 years old. A designer." },
  { "name": "Susan", "about": "25 years old. A writer." }
]

With this corrected format, the default read_json() command will work without any extra parameters.

import pandas as pd

# This now works because the file is in standard JSON format.
data = pd.read_json('data.json')

print(data)

Output:

    name                        about
0    Tom  29 years old. A programmer.
1   John    32 years old. A designer.
2  Susan      25 years old. A writer.
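For files that are too large to edit by hand, the same conversion can be scripted. The following is a minimal sketch using only the standard library; the function name jsonl_to_json_array and the file names are made up for the example, and it assumes the whole file fits in memory.

```python
import json
import os
import tempfile


def jsonl_to_json_array(src_path, dst_path):
    """Hypothetical helper: rewrite a JSON Lines file as one JSON array."""
    with open(src_path, encoding="utf-8") as src:
        # Skip blank lines so a trailing newline does not break parsing.
        records = [json.loads(line) for line in src if line.strip()]
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(records, dst, indent=2)
    return len(records)


# Demo with temporary files (shortened sample data).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data.jsonl")
    dst = os.path.join(tmp, "data_array.json")
    with open(src, "w", encoding="utf-8") as f:
        f.write('{"name": "Tom"}\n{"name": "John"}\n')
    count = jsonl_to_json_array(src, dst)
    with open(dst, encoding="utf-8") as f:
        converted = json.load(f)

print(count, converted)
```

Writing to a separate destination file leaves the original untouched, which makes the conversion easy to verify before replacing anything.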

Bonus: Cleaning Up Embedded Newline Characters

Sometimes, the string values in your JSON data contain escaped newline characters (\n). When you read such a file with lines=True, each escape is decoded into a real newline character that is preserved in your DataFrame.

data_with_newlines.json file content:

{ "name": "Tom", "about": "29 years old.\n A programmer." }
{ "name": "John", "about": "32 years old.\n A designer." }

You can easily clean these up after loading the data using the .str.replace() method.

Solution:

import pandas as pd

data = pd.read_json('data_with_newlines.json', lines=True)
print("Before cleaning:\n", data)

# Replace the newline character with a space (regex=False matches the literal character)
data['about'] = data['about'].str.replace('\n', ' ', regex=False)

print("\nAfter cleaning:\n", data)

Output:

Before cleaning:
    name                          about
0   Tom  29 years old.\n A programmer.
1  John    32 years old.\n A designer.

After cleaning:
    name                         about
0   Tom  29 years old.  A programmer.
1  John    32 years old.  A designer.
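Note that the cleaned text above still contains two consecutive spaces, because the newline was followed by a space in the source. If that matters, one option is to collapse every run of whitespace into a single space with a regular expression. This is a sketch on an in-memory DataFrame with the same sample values; the variable name df is chosen only for the example.

```python
import pandas as pd

# Same sample values as above, built in memory for a self-contained demo.
df = pd.DataFrame(
    {
        "name": ["Tom", "John"],
        "about": ["29 years old.\n A programmer.", "32 years old.\n A designer."],
    }
)

# \s+ matches any run of whitespace (spaces, tabs, newlines);
# regex=True is passed explicitly because the default differs across pandas versions.
df["about"] = df["about"].str.replace(r"\s+", " ", regex=True).str.strip()

print(df["about"].tolist())
```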

Conclusion

| If your JSON file has... | The best solution is... |
| --- | --- |
| Multiple JSON objects, one on each line. | Use the `lines=True` parameter in `pd.read_json()`. |
| A single JSON object or array. | Ensure there is no extra text or data after the closing `}` or `]`. The `lines=True` parameter is not needed. |

The ValueError: Trailing data usually indicates that your file is in the JSON Lines format. By using the lines=True parameter, you can instruct pandas to parse it correctly, resolving the error in a single step.
