Python Pandas: How to Use the "NOT IN" Filter in Pandas
Filtering rows based on values that should be excluded from a DataFrame is a common data cleaning and analysis task. While SQL has a direct NOT IN clause, Pandas achieves the same result by negating the isin() method with the ~ (tilde) operator.
This guide explains how to apply "NOT IN" filtering on single and multiple columns with clear examples and outputs.
Understanding the "NOT IN" Concept
The isin() method checks whether each element in a column exists in a given list, returning True for matches. By prefixing it with ~, you invert the result - keeping only rows where the value is not in the list:
isin([values]) → True where value IS in the list
~isin([values]) → True where value is NOT in the list
Sample DataFrame
All examples use this DataFrame:
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
print(df)
Output:
name subject marks
0 Ethan Python 96
1 Olivia R 89
2 Sophia PHP 90
3 Ava Java 85
4 Liam Python 92
Filtering a Single Column with NOT IN
Excluding Specific Names
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
# Exclude rows where name is 'Harsha' or 'Jyothika'
exclude_names = ['Harsha', 'Jyothika']
filtered = df[~df['name'].isin(exclude_names)]
print(filtered)
Output:
name subject marks
0 Ethan Python 96
1 Olivia R 89
2 Sophia PHP 90
3 Ava Java 85
4 Liam Python 92
Rows with Harsha and Jyothika are removed.
Excluding Specific Numeric Values
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
# Exclude rows where marks are 96 or 89
exclude_marks = [96, 89]
filtered = df[~df['marks'].isin(exclude_marks)]
print(filtered)
Output:
name subject marks
2 Sophia PHP 90
3 Ava Java 85
4 Liam Python 92
Excluding Values from a String Column
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
# Keep only rows where subject is NOT 'R' or 'PHP'
exclude_subjects = ['R', 'PHP']
filtered = df[~df['subject'].isin(exclude_subjects)]
print(filtered)
Output:
name subject marks
0 Ethan Python 96
3 Ava Java 85
4 Liam Python 92
Filtering Multiple Columns with NOT IN
Excluding Rows Where ANY Column Matches
To exclude rows where a value from the exclusion list appears in any of the specified columns, use isin() with any(axis=1):
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
# Exclude rows where name OR marks contains these values
exclude_values = ['Sophia', 96]
filtered = df[~df[['name', 'marks']].isin(exclude_values).any(axis=1)]
print(filtered)
Output:
name subject marks
1 Olivia R 89
3 Ava Java 85
4 Liam Python 92
Ethanis excluded becausemarks=96is in the exclusion listSophiais excluded because the name is in the exclusion list
Here is what any(axis=1) does:
# Shows which rows have at least one match across the checked columns
check = df[['name', 'marks']].isin(exclude_values)
print(check)
Output:
name marks
0 False True
1 False False
2 True False
3 False False
4 False False
any(axis=1) returns True for rows 0 and 1 because at least one column matched. The ~ then inverts this to exclude those rows.
Different Exclusion Lists per Column
When you need separate exclusion criteria for each column, combine individual conditions with & (AND) or | (OR):
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
# Exclude rows where name and subjects
exclude_names = ['Olivia', 'Sophia']
exclude_subjects = ['R', 'PHP']
# Keep rows where name is NOT in exclude_names AND subject is NOT in exclude_subjects
filtered = df[~df['name'].isin(exclude_names) & ~df['subject'].isin(exclude_subjects)]
print(filtered)
Output:
name subject marks
0 Ethan Python 96
3 Ava Java 85
4 Liam Python 92
Using query() as an Alternative
The query() method provides a more readable SQL-like syntax:
import pandas as pd
df = pd.DataFrame({
'name': ['Ethan', 'Olivia', 'Sophia', 'Ava', 'Liam'],
'subject': ['Python', 'R', 'PHP', 'Java', 'Python'],
'marks': [96, 89, 90, 85, 92]
})
exclude_names = ['Olivia', 'Sophia']
# Using query(). note: the list variable must be prefixed with @
filtered = df.query("name not in @exclude_names")
print(filtered)
Output:
name subject marks
0 Ethan Python 96
3 Ava Java 85
4 Liam Python 92
The query() method supports the not in syntax directly, which reads more naturally than ~df['col'].isin(list). Use @ to reference Python variables inside the query string.
Common Mistake: Forgetting the Tilde (~) Operator
Without the ~, isin() performs an IN filter (keeping matches instead of excluding them):
import pandas as pd
df = pd.DataFrame({
'name': ['David', 'Anna', 'Tom'],
'marks': [96, 89, 90]
})
exclude = ['Tom']
# WRONG: this keeps 'Tom' instead of excluding it
result = df[df['name'].isin(exclude)]
print("Without ~ (IN filter - wrong):")
print(result)
Output:
Without ~ (IN filter - wrong):
name marks
2 Tom 90
The correct approach:
import pandas as pd
df = pd.DataFrame({
'name': ['David', 'Anna', 'Tom'],
'marks': [96, 89, 90]
})
exclude = ['Tom']
# CORRECT: use ~ to negate - this excludes 'Harsha'
result = df[~df['name'].isin(exclude)]
print("With ~ (NOT IN filter - correct):")
print(result)
Output:
With ~ (NOT IN filter - correct):
name marks
0 David 96
1 Anna 89
The ~ operator is essential for NOT IN filtering. Without it, isin() returns the opposite of what you intend - it keeps the values you wanted to exclude.
Quick Reference
| Goal | Code |
|---|---|
| NOT IN on one column | df[~df['col'].isin(list)] |
| NOT IN on multiple columns (any match) | df[~df[['c1','c2']].isin(list).any(axis=1)] |
| NOT IN with separate lists per column | df[~df['c1'].isin(list1) & ~df['c2'].isin(list2)] |
| NOT IN using query syntax | df.query("col not in @list") |
| IN filter (keep matches) | df[df['col'].isin(list)] |
The NOT IN pattern in Pandas (~df['column'].isin(values)) is a concise and efficient way to exclude unwanted values from your data. Whether you're removing outliers, filtering categories, or cleaning datasets, this technique is one of the most frequently used operations in data analysis with Pandas.