Python NumPy: How to Remove NaN Values from a NumPy Array

NaN (Not a Number) values are a common issue in numerical computing and data analysis. They appear when data is missing or corrupted, or when a mathematical operation is undefined (like 0/0). Since NaN values can silently corrupt calculations - producing incorrect sums, means, and comparisons - removing them is often the first step in data cleaning.

In this guide, you'll learn multiple methods to remove NaN values from NumPy arrays, understand the differences between each approach, and choose the right one for your use case.

Understanding NaN in NumPy

NaN is a special floating-point value defined by the IEEE 754 standard. In NumPy, it's represented as np.nan:

import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
print("Array:", arr)
print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))

Output:

Array: [ 1. nan  3. nan  5.]
Sum: nan
Mean: nan

warning

NaN values propagate through calculations. A single NaN in an array makes np.sum(), np.mean(), np.max(), and other NumPy aggregate functions return nan for the whole array. This is why removing or handling NaN values is critical before performing any analysis.
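Before cleaning, it can help to check whether an array contains NaN at all, and how many. The following is a minimal sketch; the variable names (nan_mask, has_nan, nan_count) are illustrative and not part of any API:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

nan_mask = np.isnan(arr)          # True where the value is NaN
has_nan = bool(nan_mask.any())    # is there at least one NaN?
nan_count = int(nan_mask.sum())   # True counts as 1, so this counts the NaNs

print("Contains NaN:", has_nan)   # Contains NaN: True
print("NaN count:", nan_count)    # NaN count: 2
```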

Method 1: Using `~np.isnan()` (Recommended)

The most common and Pythonic approach combines np.isnan() with the bitwise NOT operator (~) to create a boolean mask that selects only non-NaN elements:

import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

clean = arr[~np.isnan(arr)]
print("Original:", arr)
print("Cleaned: ", clean)

Output:

Original: [ 1. nan  3. nan  5.]
Cleaned:  [1. 3. 5.]

How It Works

np.isnan(arr) creates a boolean array: [False, True, False, True, False] - True where values are NaN.
~np.isnan(arr) inverts it: [True, False, True, False, True] - True where values are valid.
arr[~np.isnan(arr)] uses boolean indexing to select only the valid elements.
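The three steps above can be written out one at a time to make the intermediate masks visible. This is only a sketch for inspection; in practice the one-line form does the same thing:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

nan_mask = np.isnan(arr)    # step 1: True where the value is NaN
valid_mask = ~nan_mask      # step 2: invert, True where the value is valid
clean = arr[valid_mask]     # step 3: boolean indexing keeps the True positions

print(nan_mask.tolist())    # [False, True, False, True, False]
print(valid_mask.tolist())  # [True, False, True, False, True]
print(clean.tolist())       # [1.0, 3.0, 5.0]
```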

With 2D Arrays

When applied to a 2D array, boolean indexing flattens the result into a 1D array:

import numpy as np

arr = np.array([[12, 5, np.nan, 7],
                [2, 61, 1, np.nan],
                [np.nan, 1, np.nan, 5]])

clean = arr[~np.isnan(arr)]
print("Cleaned (flattened):", clean)

Output:

Cleaned (flattened): [12.  5.  7.  2. 61.  1.  1.  5.]

info

Boolean indexing on multi-dimensional arrays always returns a 1D array because the selected elements don't necessarily form a regular grid. If you need to preserve the 2D structure, see the section on removing rows or columns containing NaN.
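To see the shape change directly, and to recover where the NaN values were before filtering, a short sketch using np.argwhere() may be useful. The variable name nan_positions is illustrative:

```python
import numpy as np

arr = np.array([[12, 5, np.nan, 7],
                [2, 61, 1, np.nan],
                [np.nan, 1, np.nan, 5]])

clean = arr[~np.isnan(arr)]
print(arr.shape)    # (3, 4)
print(clean.shape)  # (8,)

# (row, column) index of every NaN, in row-major order
nan_positions = np.argwhere(np.isnan(arr))
print(nan_positions.tolist())  # [[0, 2], [1, 3], [2, 0], [2, 2]]
```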

Method 2: Using `np.isfinite()`

The np.isfinite() function returns True for all finite numbers, filtering out both NaN and infinity values:

import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.inf, 5.0, -np.inf])

clean = arr[np.isfinite(arr)]
print("Original:", arr)
print("Cleaned: ", clean)

Output:

Original: [  1.  nan   3.  inf   5. -inf]
Cleaned:  [1. 3. 5.]

When to use isfinite() vs isnan()

Use ~np.isnan() when you only want to remove NaN values but keep infinite values.
Use np.isfinite() when you want to remove both NaN and infinity values.

In most data analysis scenarios, np.isfinite() is the safer choice because infinite values are usually just as problematic as NaN.
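Applying both masks to the same array shows the difference side by side. This is a small sketch; no_nan and finite_only are illustrative names:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.inf, 5.0, -np.inf])

no_nan = arr[~np.isnan(arr)]        # drops NaN, keeps +inf and -inf
finite_only = arr[np.isfinite(arr)] # drops NaN, +inf, and -inf

print(no_nan.tolist())       # [1.0, 3.0, inf, 5.0, -inf]
print(finite_only.tolist())  # [1.0, 3.0, 5.0]
```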

Method 3: Using `np.logical_not()` with `np.isnan()`

This approach is functionally identical to ~np.isnan() but uses an explicit function call instead of the bitwise NOT operator:

import numpy as np

arr = np.array([6.0, 2.0, np.nan, 8.0, np.nan, 1.0])

clean = arr[np.logical_not(np.isnan(arr))]
print("Cleaned:", clean)

Output:

Cleaned: [6. 2. 8. 1.]

The ~ operator and np.logical_not() produce the same result. Use whichever you find more readable.
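A quick check confirms that the two masks match for the boolean output of np.isnan(). One caveat worth knowing: on integer arrays, ~ is a bitwise operation and no longer agrees with np.logical_not(), as the second half of this sketch illustrates:

```python
import numpy as np

arr = np.array([6.0, 2.0, np.nan, 8.0, np.nan, 1.0])

mask_operator = ~np.isnan(arr)
mask_function = np.logical_not(np.isnan(arr))
print(np.array_equal(mask_operator, mask_function))  # True

# On integers the two differ: ~ flips bits, logical_not tests for zero
ints = np.array([0, 1])
print((~ints).tolist())                # [-1, -2]
print(np.logical_not(ints).tolist())   # [True, False]
```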

Removing Rows Containing NaN (Preserving 2D Structure)

If you need to keep the 2D shape of your array, you can remove entire rows that contain any NaN values:

import numpy as np

arr = np.array([[1.0, 2.0, 3.0],
                [4.0, np.nan, 6.0],
                [7.0, 8.0, 9.0],
                [np.nan, 11.0, 12.0]])

# Remove rows where ANY value is NaN
clean = arr[~np.isnan(arr).any(axis=1)]
print("Original shape:", arr.shape)
print("Cleaned shape: ", clean.shape)
print("Cleaned array:")
print(clean)

Output:

Original shape: (4, 3)
Cleaned shape:  (2, 3)
Cleaned array:
[[1. 2. 3.]
 [7. 8. 9.]]
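A less aggressive variant, shown here as a sketch rather than a recommendation, drops a row only when every value in it is NaN by swapping any() for all(). Rows with some valid data are kept, NaN included:

```python
import numpy as np

arr = np.array([[1.0, np.nan, 3.0],
                [np.nan, np.nan, np.nan],
                [7.0, 8.0, 9.0]])

# True for rows in which ALL values are NaN
all_nan_rows = np.isnan(arr).all(axis=1)
clean = arr[~all_nan_rows]

print(all_nan_rows.tolist())  # [False, True, False]
print(clean.shape)            # (2, 3)
```

Note that the first row still contains a NaN after this step, so NaN-aware functions or a second cleaning pass may still be needed.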

Removing Columns Containing NaN

Similarly, you can remove columns that contain any NaN values:

import numpy as np

arr = np.array([[1.0, np.nan, 3.0],
                [4.0, np.nan, 6.0],
                [7.0, np.nan, 9.0]])

# Remove columns where ANY value is NaN
clean = arr[:, ~np.isnan(arr).any(axis=0)]
print("Cleaned array:")
print(clean)

Output:

Cleaned array:
[[1. 3.]
 [4. 6.]
 [7. 9.]]
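The row and column cases differ only in which axis is reduced and which axis is indexed, so they can be wrapped in one helper. The function below, drop_nan_slices, is a hypothetical name introduced for illustration; it is not part of NumPy:

```python
import numpy as np

def drop_nan_slices(arr, axis=0):
    """Hypothetical helper: drop rows (axis=0) or columns (axis=1)
    of a 2D array that contain at least one NaN."""
    arr = np.asarray(arr, dtype=float)
    if arr.ndim != 2:
        raise ValueError("expected a 2D array")
    if axis == 0:
        # reduce across columns to get one flag per row
        return arr[~np.isnan(arr).any(axis=1)]
    if axis == 1:
        # reduce across rows to get one flag per column
        return arr[:, ~np.isnan(arr).any(axis=0)]
    raise ValueError("axis must be 0 or 1")

data = np.array([[1.0, np.nan, 3.0],
                 [4.0, 5.0, 6.0]])

rows_kept = drop_nan_slices(data, axis=0)
cols_kept = drop_nan_slices(data, axis=1)
print(rows_kept.tolist())  # [[4.0, 5.0, 6.0]]
print(cols_kept.tolist())  # [[1.0, 3.0], [4.0, 6.0]]
```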

Replacing NaN Instead of Removing

Sometimes you don't want to remove NaN values (which changes the array size) but instead replace them with a specific value:

Replace with Zero

import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

clean = np.nan_to_num(arr, nan=0.0)
print("Replaced with 0:", clean)

Output:

Replaced with 0: [1. 0. 3. 0. 5.]
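np.nan_to_num() also replaces infinities. By default, positive and negative infinity become the largest and smallest finite values of the dtype; the posinf and neginf keyword arguments (available alongside nan since NumPy 1.17) override that. The replacement values below are arbitrary choices for illustration:

```python
import numpy as np

arr = np.array([1.0, np.nan, np.inf, -np.inf])

# Default behaviour: NaN -> 0.0, +inf/-inf -> largest/smallest finite float
default = np.nan_to_num(arr)
print(default[2] == np.finfo(float).max)  # True

# Explicit replacements for each kind of non-finite value
custom = np.nan_to_num(arr, nan=0.0, posinf=1e6, neginf=-1e6)
print(custom.tolist())  # [1.0, 0.0, 1000000.0, -1000000.0]
```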

Replace with the Mean

import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

mean_val = np.nanmean(arr)  # Computes mean ignoring NaN
arr[np.isnan(arr)] = mean_val  # Overwrites the NaN entries of arr in place
print(f"Replaced with mean ({mean_val}):", arr)

Output:

Replaced with mean (3.0): [1. 3. 3. 3. 5.]
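For 2D data, a common variation is to fill each NaN with the mean of its own column. The sketch below assumes columns are features and uses np.where(), which returns a new array and leaves the original untouched:

```python
import numpy as np

arr = np.array([[1.0, 10.0],
                [np.nan, 20.0],
                [3.0, np.nan]])

# Per-column means, ignoring NaN
col_means = np.nanmean(arr, axis=0)

# Where arr is NaN take the column mean (broadcast across rows), else keep arr
filled = np.where(np.isnan(arr), col_means, arr)

print(col_means.tolist())  # [2.0, 15.0]
print(filled.tolist())     # [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
```

If a column consists entirely of NaN, np.nanmean() returns nan for it and emits a RuntimeWarning, so that column would stay unfilled.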

tip

NumPy provides NaN-aware versions of common functions: np.nansum(), np.nanmean(), np.nanmax(), np.nanmin(), np.nanstd(). These compute results while ignoring NaN values, which can be simpler than removing NaN values first.

import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
print("nanmean:", np.nanmean(arr))  # nanmean: 3.0
print("nansum:", np.nansum(arr))    # nansum: 9.0

Common Mistake: Using `==` to Check for NaN

A frequent error is trying to detect NaN by comparing values against np.nan with the == or != operators:

import numpy as np

arr = np.array([1.0, np.nan, 3.0])

# ❌ Wrong: NaN is NOT equal to itself
mask = arr != np.nan
print("Mask:", mask)
print("Result:", arr[mask])

Output:

Mask: [ True  True  True]
Result: [ 1. nan  3.]

The NaN value is not removed because, by IEEE 754 definition, NaN != NaN is True. This means np.nan == np.nan returns False, making equality comparisons useless for detecting NaN.
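The rule can be verified directly. As a side effect of it, comparing an array with itself using != marks exactly the NaN positions; this is shown only to illustrate the behavior, since np.isnan() states the intent more clearly:

```python
import numpy as np

print(np.nan == np.nan)  # False
print(np.nan != np.nan)  # True

arr = np.array([1.0, np.nan, 3.0])

# A value differs from itself only if it is NaN
self_mask = arr != arr
print(self_mask.tolist())                          # [False, True, False]
print(np.array_equal(self_mask, np.isnan(arr)))    # True
```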

Always use np.isnan():

import numpy as np

arr = np.array([1.0, np.nan, 3.0])

# ✅ Correct: use np.isnan()
mask = ~np.isnan(arr)
print("Result:", arr[mask])

Output:

Result: [1. 3.]

Comparison of Methods

Method	Removes NaN	Removes `inf`	Preserves Shape	Use Case
`arr[~np.isnan(arr)]`	✅	❌	❌ (flattens)	Most common, NaN only
`arr[np.isfinite(arr)]`	✅	✅	❌ (flattens)	Remove all non-finite values
`arr[~np.isnan(arr).any(axis=1)]`	✅	❌	✅ (removes rows)	Clean rows in 2D arrays
`np.nan_to_num(arr)`	Replaces	Replaces	✅	Keep shape, fill with defaults
`np.nanmean()`, `np.nansum()`	Ignores	❌	✅	Compute stats without cleaning

Summary

To remove NaN values from a NumPy array:

Use arr[~np.isnan(arr)] for the simplest, most common approach - it filters out all NaN values from a floating-point array and returns a 1D result.
Use arr[np.isfinite(arr)] when you also need to remove infinity values.
Use arr[~np.isnan(arr).any(axis=1)] to remove entire rows containing NaN while preserving the 2D structure.
Use np.nan_to_num() or assignment with np.isnan() when you want to replace NaN values instead of removing them.
Never use == or != to check for NaN - always use np.isnan().
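One practical caveat when applying these methods: np.isnan() expects numeric input. Integer arrays cannot hold NaN, so the mask is all False, and object arrays raise a TypeError. The sketch below assumes the object array holds values that can be converted to float:

```python
import numpy as np

# Integer arrays cannot store NaN, so there is nothing to remove
int_arr = np.array([1, 2, 3])
print(bool(np.isnan(int_arr).any()))  # False

# Object arrays are rejected by np.isnan()
obj_arr = np.array([1.0, np.nan, 3.0], dtype=object)
try:
    np.isnan(obj_arr)
    raised = False
except TypeError:
    raised = True
print("TypeError raised:", raised)  # TypeError raised: True

# Converting to float first makes the usual mask work
as_float = obj_arr.astype(float)
clean = as_float[~np.isnan(as_float)]
print(clean.tolist())  # [1.0, 3.0]
```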
