How to Check if a CSV File Is Empty in Python
Reading CSV (Comma-Separated Values) files is one of the most common tasks in data processing with Python. However, attempting to read an empty CSV file can lead to unexpected errors and break your data pipeline. Before processing any CSV file, it's good practice to verify that it actually contains data.
In this guide, you'll learn multiple reliable methods to check if a CSV file is empty in Python, using pandas, the os module, and file reading techniques. Each approach handles different definitions of "empty," from completely blank files to files that contain only a header row.
What Does "Empty" Mean for a CSV File?
Before choosing a method, it's important to clarify what "empty" means in your context:
- Completely empty: The file has zero bytes (no content at all).
- Header only: The file contains a header row but no data rows.
- No meaningful data: The file exists and has rows, but all values are null or blank.
Each method below addresses one or more of these scenarios.
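The third scenario (rows exist, but every value is blank) is not caught by a size or line-count check. The following is a minimal sketch of one way to detect it with the standard-library csv module. The function name has_meaningful_data is hypothetical, and the sketch assumes that the first row is a header and that "blank" means an empty or whitespace-only string.

```python
import csv


def has_meaningful_data(file_path):
    """Return True if any data row (after the header) has a non-blank cell.

    Assumptions: the first row is a header, and "blank" means an empty or
    whitespace-only string. Markers such as "NaN" or "NULL" count as data here.
    """
    with open(file_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        for row in reader:
            if any(cell.strip() for cell in row):
                return True  # stop at the first cell that holds a value
    return False
```

If the file is already loaded into pandas, df.dropna(how="all").empty gives a comparable answer for truly empty fields, since read_csv parses them as NaN by default.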
Using the Pandas empty Attribute
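The pandas DataFrame.empty attribute returns True if the DataFrame has no rows (more precisely, if either axis has length 0) and False otherwise. A DataFrame whose rows contain only NaN values is not considered empty. This is the most intuitive approach when you're already working with pandas.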
import pandas as pd
csv_file_path = 'data.csv'
df = pd.read_csv(csv_file_path)
if df.empty:
    print("CSV file is empty.")
else:
    print(df.head())
Output (when the file contains only a header row):
CSV file is empty.
Output (when the file contains data):
Name City Age
0 Alice London 30
1 Bob Paris 25
2 Carol Berlin 35
If the CSV file has zero bytes (not even a header), pd.read_csv() raises a pandas.errors.EmptyDataError before you can check the empty attribute:
import pandas as pd
df = pd.read_csv('completely_blank.csv') # Raises EmptyDataError
Output:
pandas.errors.EmptyDataError: No columns to parse from file
To handle this safely, wrap the call in a try/except block as shown in the next section.
Using a Try-Except Block for Robust Handling
A try/except block lets you catch specific exceptions that occur when reading problematic CSV files. This is the recommended approach for production code because it handles multiple failure scenarios gracefully.
import pandas as pd
csv_file_path = 'data.csv'
try:
    df = pd.read_csv(csv_file_path)
    if df.empty:
        print("CSV file contains a header but no data rows.")
    else:
        print(f"CSV file has {len(df)} rows.")
        print(df.head())
except pd.errors.EmptyDataError:
    print("CSV file is completely empty (no header, no data).")
except FileNotFoundError:
    print(f"File '{csv_file_path}' not found.")
Output (completely blank file):
CSV file is completely empty (no header, no data).
Output (header-only file):
CSV file contains a header but no data rows.
Output (file with data):
CSV file has 3 rows.
Name City Age
0 Alice London 30
1 Bob Paris 25
2 Carol Berlin 35
This method distinguishes between three cases:
- The file doesn't exist (FileNotFoundError).
- The file is completely blank (EmptyDataError).
- The file has a header but no data rows (df.empty).
Checking File Size with os.path.getsize()
If you want to check whether a CSV file is empty before loading it into pandas, the os module provides a fast way to inspect the file size in bytes.
import os
import pandas as pd
csv_file_path = 'data.csv'
file_size = os.path.getsize(csv_file_path)
if file_size == 0:
    print("CSV file is completely empty (0 bytes).")
else:
    df = pd.read_csv(csv_file_path)
    if df.empty:
        print("CSV file contains only a header row.")
    else:
        print(f"CSV file has {len(df)} rows of data.")
Output (0-byte file):
CSV file is completely empty (0 bytes).
A file with only a header row like Name,City,Age\n will have a size greater than zero. The os.path.getsize() check alone won't catch header-only files, so combine it with the df.empty check for complete coverage.
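Two details of the size check are easy to miss: os.path.getsize() raises FileNotFoundError when the path does not exist, and any byte at all (even a lone newline) makes the size non-zero. A minimal sketch of a wrapper that handles the first case is shown here. The helper name is_zero_bytes and its three-way return value are illustrative choices, not part of the os module.

```python
import os


def is_zero_bytes(file_path):
    """Return True for a 0-byte file and False for any larger file.

    A missing file returns None instead of raising, so the caller can
    tell "empty" apart from "not there".
    """
    try:
        return os.path.getsize(file_path) == 0
    except FileNotFoundError:
        return None
```

Because a file that contains only a newline character is 1 byte long, this helper reports it as not empty. A content-based check is still needed for such files.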
Detecting Header-Only Files Without Pandas
Sometimes you want to check if a CSV file is empty without importing pandas at all. You can count the number of lines in the file directly:
def is_csv_empty(file_path):
    """Check if a CSV file is empty or contains only a header."""
    with open(file_path, 'r') as f:
        line_count = sum(1 for line in f)
    return line_count <= 1
csv_file_path = 'data.csv'
if is_csv_empty(csv_file_path):
    print("CSV file is empty or contains only a header.")
else:
    print("CSV file has data. Safe to process.")
Output (header-only file):
CSV file is empty or contains only a header.
How it works:
- If the file has 0 lines, it's completely empty.
- If it has exactly 1 line, it contains only the header.
- If it has more than 1 line, it contains data rows.
For very large files, this approach is more efficient than loading the entire file into a pandas DataFrame just to check if it's empty. It reads the file line by line without storing everything in memory.
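One caveat with a raw line count is that blank lines are counted too, so a header followed by a few stray empty lines would be reported as having data. A small variation that ignores whitespace-only lines might look like the sketch below. The function names are hypothetical, and UTF-8 encoding is an assumption.

```python
def count_nonblank_lines(file_path):
    """Count lines that contain something other than whitespace."""
    with open(file_path, "r", encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def is_csv_empty_ignoring_blanks(file_path):
    """Treat a file as empty if it has at most one non-blank line (the header)."""
    return count_nonblank_lines(file_path) <= 1
```

This still reads the whole file, so it trades a little speed for tolerance of trailing blank lines.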
A More Efficient Line-Count Check
The method above reads the entire file to count lines. For large files, you can optimize by reading only the first two lines:
def is_csv_empty(file_path):
    """Efficiently check if a CSV has data by reading at most 2 lines."""
    with open(file_path, 'r') as f:
        first_line = f.readline()
        if not first_line.strip():
            return True  # File is empty or starts with a blank line
        second_line = f.readline()
        if not second_line.strip():
            return True  # Second line is missing or blank: header only
    return False
if is_csv_empty('data.csv'):
    print("CSV file is empty or header-only.")
else:
    print("CSV file contains data.")
This version stops reading after two lines, making it extremely fast regardless of file size.
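Reading two physical lines assumes that every record fits on one line and that no blank line sits between the header and the data. A quoted header field with an embedded newline, or a blank line after the header, would mislead it. A hedged alternative is to let the standard-library csv module parse records and still stop early. The function name has_data_rows is hypothetical, and UTF-8 encoding is an assumption.

```python
import csv
from itertools import islice


def has_data_rows(file_path):
    """Return True if the file has at least one non-empty record after the header."""
    with open(file_path, newline="", encoding="utf-8") as f:
        # csv.reader yields [] for blank lines; drop those before counting
        records = (row for row in csv.reader(f) if row)
        # Pull at most two records: the header and the first data row
        return len(list(islice(records, 2))) == 2
```

Like the two-line version, this reads only as much of the file as it needs, so it stays fast on large files.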
Combining All Checks Into a Reusable Function
For production-quality code, combine the best aspects of each method into a single reusable function:
import os
import pandas as pd
def check_csv_status(file_path):
    """
    Check the status of a CSV file.
    Returns a tuple of (status, dataframe_or_none), where status is one of
    "not_found", "empty_file", "header_only", or "has_data".
    """
    # Check if file exists
    if not os.path.exists(file_path):
        return "not_found", None
    # Check if file is completely empty (0 bytes)
    if os.path.getsize(file_path) == 0:
        return "empty_file", None
    # Try to read with pandas
    try:
        df = pd.read_csv(file_path)
    except pd.errors.EmptyDataError:
        return "empty_file", None
    # Check for header-only
    if df.empty:
        return "header_only", None
    return "has_data", df
# Usage
status, df = check_csv_status('data.csv')
if status == "not_found":
    print("Error: File not found.")
elif status == "empty_file":
    print("Warning: CSV file is completely empty.")
elif status == "header_only":
    print("Warning: CSV file contains only a header row.")
else:
    print(f"Success: CSV file has {len(df)} rows.")
    print(df.head())
Output (file with data):
Success: CSV file has 3 rows.
Name City Age
0 Alice London 30
1 Bob Paris 25
2 Carol Berlin 35
Quick Comparison of Methods
| Method | Detects 0-Byte | Detects Header-Only | Needs Pandas | Performance |
|---|---|---|---|---|
| df.empty attribute | ❌ (raises error) | ✅ | ✅ | Moderate |
| Try-except block | ✅ | ✅ | ✅ | Moderate |
| os.path.getsize() | ✅ | ❌ | ❌ | ⚡ Fast |
| Line-count check | ✅ | ✅ | ❌ | ⚡ Fast |
| Combined function | ✅ | ✅ | ✅ | Moderate |
Conclusion
Checking whether a CSV file is empty before processing it prevents unexpected crashes and makes your data pipelines more robust. Here's which method to use:
- For quick scripts, use df.empty after pd.read_csv(). It's simple and handles header-only files.
- For production code, wrap pd.read_csv() in a try-except block to catch EmptyDataError and FileNotFoundError alongside the empty check.
- For performance-sensitive applications, check the file size with os.path.getsize() or read only the first two lines. Both avoid loading the entire file into memory.
- For the most thorough validation, use a combined function that checks file existence, byte size, and DataFrame content in sequence.
Whichever approach you choose, always account for the difference between a completely blank file and a file that contains only a header row, because they require different handling strategies.