Skip to main content

Python Pandas: How to Use the "NOT IN" Filter in Pandas

Filtering rows based on values that should be excluded from a DataFrame is a common data cleaning and analysis task. While SQL has a direct NOT IN clause, Pandas achieves the same result by negating the isin() method with the ~ (tilde) operator.

This guide explains how to apply "NOT IN" filtering on single and multiple columns with clear examples and outputs.

Understanding the "NOT IN" Concept

The isin() method checks whether each element in a column exists in a given list, returning True for matches. By prefixing it with ~, you invert the result - keeping only rows where the value is not in the list:

isin([values])   → True where value IS in the list
~isin([values]) → True where value is NOT in the list

Sample DataFrame

All examples use this DataFrame:

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

print(df)

Output:

     name subject  marks
0 Ethan Python 96
1 Olivia R 89
2 Sophia PHP 90
3 Ava Java 85
4 Liam Python 92

Filtering a Single Column with NOT IN

Excluding Specific Names

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

# Exclude rows where name is 'Harsha' or 'Jyothika'
exclude_names = ['Harsha', 'Jyothika']
filtered = df[~df['name'].isin(exclude_names)]
print(filtered)

Output:

     name subject  marks
0 Ethan Python 96
1 Olivia R 89
2 Sophia PHP 90
3 Ava Java 85
4 Liam Python 92

Rows with Harsha and Jyothika are removed.

Excluding Specific Numeric Values

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

# Exclude rows where marks are 96 or 89
exclude_marks = [96, 89]
filtered = df[~df['marks'].isin(exclude_marks)]
print(filtered)

Output:

     name subject  marks
2 Sophia PHP 90
3 Ava Java 85
4 Liam Python 92

Excluding Values from a String Column

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

# Keep only rows where subject is NOT 'R' or 'PHP'
exclude_subjects = ['R', 'PHP']
filtered = df[~df['subject'].isin(exclude_subjects)]
print(filtered)

Output:

    name subject  marks
0 Ethan Python 96
3 Ava Java 85
4 Liam Python 92

Filtering Multiple Columns with NOT IN

Excluding Rows Where ANY Column Matches

To exclude rows where a value from the exclusion list appears in any of the specified columns, use isin() with any(axis=1):

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

# Exclude rows where name OR marks contains these values
exclude_values = ['Sophia', 96]
filtered = df[~df[['name', 'marks']].isin(exclude_values).any(axis=1)]
print(filtered)

Output:

     name subject  marks
1 Olivia R 89
3 Ava Java 85
4 Liam Python 92
  • Ethan is excluded because marks=96 is in the exclusion list
  • Sophia is excluded because the name is in the exclusion list

Here is what any(axis=1) does:

# Shows which rows have at least one match across the checked columns
check = df[['name', 'marks']].isin(exclude_values)
print(check)

Output:

    name  marks
0 False True
1 False False
2 True False
3 False False
4 False False

any(axis=1) returns True for rows 0 and 1 because at least one column matched. The ~ then inverts this to exclude those rows.

Different Exclusion Lists per Column

When you need separate exclusion criteria for each column, combine individual conditions with & (AND) or | (OR):

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

# Exclude rows where name and subjects
exclude_names = ['Olivia', 'Sophia']
exclude_subjects = ['R', 'PHP']

# Keep rows where name is NOT in exclude_names AND subject is NOT in exclude_subjects
filtered = df[~df['name'].isin(exclude_names) & ~df['subject'].isin(exclude_subjects)]
print(filtered)

Output:

    name subject  marks
0 Ethan Python 96
3 Ava Java 85
4 Liam Python 92

Using query() as an Alternative

The query() method provides a more readable SQL-like syntax:

import pandas as pd

df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})

exclude_names = ['Olivia', 'Sophia']

# Using query(). note: the list variable must be prefixed with @
filtered = df.query("name not in @exclude_names")
print(filtered)

Output:

    name subject  marks
0 Ethan Python 96
3 Ava Java 85
4 Liam Python 92
tip

The query() method supports the not in syntax directly, which reads more naturally than ~df['col'].isin(list). Use @ to reference Python variables inside the query string.

Common Mistake: Forgetting the Tilde (~) Operator

Without the ~, isin() performs an IN filter (keeping matches instead of excluding them):

import pandas as pd

df = pd.DataFrame({
'name': ['David', 'Anna', 'Tom'],
'marks': [96, 89, 90]
})

exclude = ['Tom']

# WRONG: this keeps 'Tom' instead of excluding it
result = df[df['name'].isin(exclude)]
print("Without ~ (IN filter - wrong):")
print(result)

Output:

Without ~ (IN filter - wrong):
name marks
2 Tom 90

The correct approach:

import pandas as pd

df = pd.DataFrame({
'name': ['David', 'Anna', 'Tom'],
'marks': [96, 89, 90]
})

exclude = ['Tom']

# CORRECT: use ~ to negate - this excludes 'Harsha'
result = df[~df['name'].isin(exclude)]
print("With ~ (NOT IN filter - correct):")
print(result)

Output:

With ~ (NOT IN filter - correct):
name marks
0 David 96
1 Anna 89
warning

The ~ operator is essential for NOT IN filtering. Without it, isin() returns the opposite of what you intend - it keeps the values you wanted to exclude.

Quick Reference

GoalCode
NOT IN on one columndf[~df['col'].isin(list)]
NOT IN on multiple columns (any match)df[~df[['c1','c2']].isin(list).any(axis=1)]
NOT IN with separate lists per columndf[~df['c1'].isin(list1) & ~df['c2'].isin(list2)]
NOT IN using query syntaxdf.query("col not in @list")
IN filter (keep matches)df[df['col'].isin(list)]

The NOT IN pattern in Pandas (~df['column'].isin(values)) is a concise and efficient way to exclude unwanted values from your data. Whether you're removing outliers, filtering categories, or cleaning datasets, this technique is one of the most frequently used operations in data analysis with Pandas.