Python Pandas: How to Resolve "xlrd.biffh.XLRDError: Excel xlsx file; not supported"

When using the pandas.read_excel() function, you might encounter the xlrd.biffh.XLRDError: Excel xlsx file; not supported. This error occurs because recent versions of the xlrd library (2.0.0 and newer) have intentionally removed support for reading modern .xlsx files, now exclusively handling legacy .xls files. This change was made for security reasons.

The correct solution is to use the openpyxl library as the engine for reading .xlsx files, which is the modern standard for this task within the pandas ecosystem. This guide will walk you through the recommended solution and explain how to handle both .xlsx and older .xls files correctly.

Understanding the Error: The xlrd 2.0.0 Change

The distinction between Excel file formats is crucial here:

  • .xls: The older, binary format used by Excel 97-2003.
  • .xlsx: The modern, XML-based format (Office Open XML) used since Excel 2007. An .xlsx file is a ZIP archive of XML parts, so reading it means parsing XML. Macro-enabled workbooks use the separate .xlsm extension.
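Because the two formats use different containers, they can also be told apart by their first bytes rather than by file extension, which helps when a file has been renamed. The following is a minimal sketch, assuming the standard ZIP and OLE2 file signatures; `sniff_excel_format` is a hypothetical helper name and is not part of pandas, xlrd, or openpyxl.

```python
# Leading bytes ("magic numbers") of the two container formats.
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx is a ZIP archive of XML parts
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # .xls is an OLE2 compound file


def sniff_excel_format(path):
    """Return 'xlsx-like', 'xls-like', or 'unknown' from the file's first bytes.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(ZIP_MAGIC):
        return "xlsx-like"   # any ZIP-based file matches, not only .xlsx
    if header.startswith(OLE2_MAGIC):
        return "xls-like"    # any OLE2 file matches, not only .xls
    return "unknown"
```

The check only identifies the container, so a result of "xlsx-like" means "ZIP-based", not "guaranteed to be a valid workbook".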

In version 2.0.0, the maintainers of xlrd removed support for .xlsx files due to potential security vulnerabilities. As a result, xlrd is now a dedicated reader for the legacy .xls format only. The error is raised whenever xlrd 2.0.0 or newer is asked to open an .xlsx file, which typically happens when an older pandas release (before 1.2) still defaults to xlrd, or when the code passes engine='xlrd' explicitly.

The modern and secure way to read .xlsx files with pandas is to use the openpyxl engine.

Example of code causing the error:

import pandas as pd

# This fails when pandas hands the .xlsx file to xlrd 2.0.0 or newer.
df = pd.read_excel('example.xlsx')

Output:

xlrd.biffh.XLRDError: Excel xlsx file; not supported
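Since the error depends on which library versions are installed, it can help to check them before changing anything. The sketch below uses only the standard library (`importlib.metadata`, available in Python 3.8 and later); `installed_versions` is a hypothetical helper name chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("pandas", "xlrd", "openpyxl")):
    """Map each package name to its installed version string, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

An xlrd version of 2.0.0 or newer combined with a missing openpyxl is the usual combination behind this error.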

Solution:

Step 1: Install/Upgrade Required Libraries

First, ensure you have up-to-date versions of pandas and openpyxl.

pip install --upgrade pandas openpyxl

Step 2: Specify the openpyxl Engine

Next, explicitly tell read_excel() to use the openpyxl engine.

import pandas as pd

# ✅ Correct: Specify engine='openpyxl' for .xlsx files.
df = pd.read_excel(
    'example.xlsx',
    engine='openpyxl'
)

print(df)

Output (assuming example.xlsx contains this data):

    Name  Salary
0  Alice     100
1    Tom      75
2   Carl     150

Note

In pandas 1.2 and later, read_excel() automatically selects openpyxl for .xlsx files when it is installed. Explicitly specifying engine='openpyxl' still makes the intent clear and keeps the code working on older pandas releases that default to xlrd.

How to Read Legacy .xls Files

The xlrd library is still the correct tool for reading old .xls files. The openpyxl library does not support this format.

Solution:

import pandas as pd

# Ensure xlrd is installed: pip install xlrd

# ✅ Correct: Specify engine='xlrd' for .xls files.
df = pd.read_excel(
    'legacy_file.xls',
    engine='xlrd'
)

print(df)

Another widely suggested but ill-advised solution is to downgrade xlrd to the last version that supported .xlsx files (1.2.0).

Warning

This is not recommended.

  1. Security Risk: You would be deliberately running a version whose .xlsx support was removed because of potential security vulnerabilities.
  2. Compatibility Issues: Recent pandas releases enforce a minimum xlrd version and raise an ImportError when an older one is installed, so you would have to downgrade pandas as well and miss out on new features and bug fixes.

Example of discouraged workaround:

# This command installs an old, insecure version of xlrd.
pip install "xlrd==1.2.0"

If you do this, you will likely encounter a new error from pandas itself: ImportError: Pandas requires version '2.0.1' or newer of 'xlrd' (version '1.2.0' currently installed).

Conclusion

The xlrd.biffh.XLRDError: Excel xlsx file; not supported is the result of a deliberate and important library update. The fix is to use the correct engine for the correct file type.

File Type        Correct Engine   Installation Required
.xlsx (Modern)   openpyxl         pip install openpyxl
.xls (Legacy)    xlrd             pip install xlrd
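The mapping in the table above can be encoded once so that callers never pick the wrong engine by hand. This is a minimal sketch, not an official pandas feature: `ENGINE_BY_SUFFIX`, `engine_for`, and `read_excel_auto` are hypothetical names, and the choice to reject other extensions is an assumption made for the example.

```python
from pathlib import Path

# Mirrors the table above; extensions outside it are deliberately rejected.
ENGINE_BY_SUFFIX = {
    ".xlsx": "openpyxl",
    ".xls": "xlrd",
}


def engine_for(path):
    """Pick a read_excel engine name from the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"No engine configured for {suffix!r} files") from None


def read_excel_auto(path, **kwargs):
    """Call pandas.read_excel with the engine chosen by engine_for()."""
    import pandas as pd  # imported lazily so engine_for() works without pandas

    return pd.read_excel(path, engine=engine_for(path), **kwargs)
```

With this in place, `read_excel_auto('example.xlsx')` would use openpyxl and `read_excel_auto('legacy_file.xls')` would use xlrd, provided the corresponding library is installed.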

To resolve the error, install openpyxl and use engine='openpyxl' when reading .xlsx files. This is the modern, secure, and officially recommended approach.