Python Pandas: How to Check for NaN Values in a Pandas DataFrame
Missing data, represented as NaN (Not a Number) in Pandas, is one of the most common issues you will encounter when working with real-world datasets. Values can be missing because of incomplete data collection, failed API responses, merge mismatches, or simply empty fields in a CSV file. Identifying where and how much data is missing is a critical first step before any analysis, because undetected NaN values can silently distort calculations, break aggregations, and produce misleading results.
In this guide, you will learn how to detect, count, locate, and filter missing values in a Pandas DataFrame using a variety of built-in methods.
Counting NaN Values per Column
The most common starting point for data quality assessment is checking how many missing values exist in each column:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'A': [1, np.nan, 3, np.nan],
'B': [4, 5, np.nan, 6],
'C': [7, 8, 9, 10]
})
print(df.isna().sum())
Output:
A 2
B 1
C 0
dtype: int64
The .isna() method returns a DataFrame of boolean values (True where a value is NaN, False otherwise), and .sum() counts the True values in each column. This immediately tells you that column A has 2 missing values, column B has 1, and column C is complete.
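On wider DataFrames it helps to sort those counts so the most incomplete columns appear first. A small sketch using the same frame as above, with complete columns filtered out:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, np.nan, 3, np.nan],
    'B': [4, 5, np.nan, 6],
    'C': [7, 8, 9, 10]
})

# Sort per-column NaN counts, worst first, and hide complete columns
nan_counts = df.isna().sum().sort_values(ascending=False)
print(nan_counts[nan_counts > 0])
```

This prints only A (2) and B (1), which keeps the report readable when a DataFrame has dozens of columns.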
Checking if Any NaN Exists
Sometimes you just need a quick yes-or-no answer about whether your DataFrame contains any missing data at all:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan], 'B': [3, 4]})
# Returns True if at least one NaN exists anywhere
has_nan = df.isna().any().any()
print(f"DataFrame has missing values: {has_nan}")
# Check which columns contain NaN
print(df.isna().any())
Output:
DataFrame has missing values: True
A True
B False
dtype: bool
The first .any() checks each column, returning a boolean Series. The second .any() collapses that Series into a single boolean. This two-step chain is a useful pattern for quick validation checks in data pipelines.
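In a pipeline, that check often becomes a guard function. Here is a minimal sketch; the function name `assert_no_missing` is a hypothetical helper, and it uses `.to_numpy().any()`, which collapses the whole boolean frame in a single step:

```python
import pandas as pd
import numpy as np

def assert_no_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pipeline guard: raise if the frame contains any NaN."""
    # .to_numpy().any() flattens the boolean DataFrame and checks it at once
    if df.isna().to_numpy().any():
        raise ValueError("DataFrame contains missing values")
    return df

clean = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
dirty = pd.DataFrame({'A': [1, np.nan], 'B': [3, 4]})

print(assert_no_missing(clean) is clean)  # clean data passes through
try:
    assert_no_missing(dirty)
except ValueError as e:
    print(e)  # dirty data raises
```

Returning the DataFrame unchanged lets the guard sit inside a method chain or a sequence of processing steps.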
Getting the Total NaN Count
To find the total number of missing values across the entire DataFrame, chain two .sum() calls:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'A': [1, np.nan, 3],
'B': [np.nan, np.nan, 6]
})
total_nan = df.isna().sum().sum()
print(f"Total NaN values: {total_nan}")
Output:
Total NaN values: 3
The first .sum() counts per column, and the second .sum() adds those counts together.
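The same total can also be computed in one pass by flattening to a NumPy array first; a brief sketch:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [np.nan, np.nan, 6]
})

# Equivalent to df.isna().sum().sum(), computed over the flat array
total_nan = int(df.isna().to_numpy().sum())
print(total_nan)  # 3
```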
Calculating the Percentage of Missing Values
Raw counts are useful, but percentages give a clearer picture of data quality, especially when columns have different lengths or when comparing across datasets:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'A': [1, np.nan, np.nan, 4],
'B': [np.nan, 2, 3, 4],
'C': [1, 2, 3, 4]
})
pct_missing = (df.isna().sum() / len(df)) * 100
print(pct_missing.round(1))
Output:
A 50.0
B 25.0
C 0.0
dtype: float64
A column with 50% missing values may need different treatment than one with only 25% missing. You can also use df.isna().mean() * 100 as a more concise equivalent.
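Counts and percentages can be combined into a single summary frame, which is a convenient shape for reports. A sketch using the same data:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, np.nan, np.nan, 4],
    'B': [np.nan, 2, 3, 4],
    'C': [1, 2, 3, 4]
})

# One row per column: raw count plus percentage, worst columns first
summary = pd.DataFrame({
    'missing': df.isna().sum(),
    'pct': (df.isna().mean() * 100).round(1)
}).sort_values('pct', ascending=False)
print(summary)
```

Here A tops the table with 2 missing (50.0%), followed by B with 1 (25.0%) and C with 0.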
Filtering Rows That Contain NaN
To see exactly which rows have missing data, use boolean indexing with .isna().any(axis=1):
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, np.nan, 35],
'City': ['NYC', 'LA', np.nan]
})
# Rows with at least one NaN
rows_with_nan = df[df.isna().any(axis=1)]
print("Rows with missing values:")
print(rows_with_nan)
Output:
Rows with missing values:
Name Age City
1 Bob NaN LA
2 Charlie 35.0 NaN
The axis=1 parameter tells .any() to check across columns for each row, returning True for any row that contains at least one NaN.
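The same axis=1 idea extends naturally: `.sum(axis=1)` counts missing values per row, and swapping `.any()` for `.all()` finds rows that are entirely empty. A short sketch:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 35],
    'City': ['NYC', 'LA', np.nan]
})

# Count missing values in each row
nan_per_row = df.isna().sum(axis=1)
print(nan_per_row)

# .all(axis=1) flags rows where every value is NaN
fully_missing = df[df.isna().all(axis=1)]
print(len(fully_missing))  # 0 here: no row is completely empty
```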
Filtering Rows Without NaN
To get only the complete rows, negate the condition or use .dropna():
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, np.nan, 35],
'City': ['NYC', 'LA', np.nan]
})
# Using boolean negation
complete_rows = df[~df.isna().any(axis=1)]
print(complete_rows)
# Equivalent shortcut
complete_rows = df.dropna()
print(complete_rows)
Output:
Name Age City
0 Alice 25.0 NYC
Name Age City
0 Alice 25.0 NYC
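When only certain columns matter, `.dropna()` accepts a `subset` parameter so rows are dropped only when those columns are missing. A sketch with the same data:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 35],
    'City': ['NYC', 'LA', np.nan]
})

# Drop rows only when 'Age' is missing; NaN in other columns is kept
complete_age = df.dropna(subset=['Age'])
print(complete_age['Name'].tolist())  # ['Alice', 'Charlie']
```

Charlie survives the filter despite the missing City because only Age was checked.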
Checking a Specific Column
When you already know which column to investigate, apply the checks directly to that single column:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Price': [10.5, np.nan, 20.0, np.nan]})
# Count NaN in one column
nan_count = df['Price'].isna().sum()
print(f"Missing prices: {nan_count}")
# Check if column has any NaN
has_missing = df['Price'].isna().any()
print(f"Has missing: {has_missing}")
# Get a boolean mask showing which rows are NaN
print(df['Price'].isna())
Output:
Missing prices: 2
Has missing: True
0 False
1 True
2 False
3 True
Name: Price, dtype: bool
The boolean mask from .isna() can be used directly for filtering or further analysis.
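For example, the mask can select the rows where the column is missing, and its inverse `.notna()` selects the rows where it is present:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'Price': [10.5, np.nan, 20.0, np.nan]})

# Rows where Price is missing
missing_rows = df[df['Price'].isna()]
print(missing_rows.index.tolist())  # [1, 3]

# Rows where Price is present (.notna() is the inverse of .isna())
present_rows = df[df['Price'].notna()]
print(present_rows['Price'].tolist())  # [10.5, 20.0]
```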
Using df.info() for a Visual Summary
The .info() method provides a quick overview of the DataFrame's structure, including the non-null count for each column:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'A': [1, np.nan, 3],
'B': [4, 5, np.nan],
'C': [7, 8, 9]
})
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 A 2 non-null float64
1 B 2 non-null float64
2 C 3 non-null int64
dtypes: float64(2), int64(1)
memory usage: 204.0 bytes
Columns A and B show "2 non-null" out of 3 entries, indicating one missing value each. This method is especially helpful during initial data exploration because it also shows data types and memory usage.
A Note on isna() vs isnull()
isna() and isnull() are completely identical in Pandas. Both detect missing values in exactly the same way. The isna() spelling is generally preferred because it is more explicit and aligns with the pd.NA sentinel value introduced in newer Pandas versions.
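You can verify the equivalence directly: both methods return identical boolean frames.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan], 'B': [3, 4]})

# isnull() is an alias of isna(); the results are element-for-element equal
print(df.isna().equals(df.isnull()))  # True
```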
Quick Reference
| Goal | Method |
|---|---|
| Count NaN per column | df.isna().sum() |
| Total NaN count | df.isna().sum().sum() |
| Check if any NaN exists | df.isna().any().any() |
| Percentage missing | df.isna().mean() * 100 |
| Filter rows with NaN | df[df.isna().any(axis=1)] |
| Filter complete rows | df.dropna() |
| Check one column | df['col'].isna().sum() |
| Visual summary | df.info() |
- Start with df.isna().sum() as your standard first step in data cleaning to see how missing values are distributed across columns.
- Use df.isna().any().any() for a quick boolean check in automated pipelines.
- For deeper investigation, filter rows with df[df.isna().any(axis=1)] to see exactly which records have gaps and decide how to handle them.