Python Pandas: How to Get Elements of a Pandas Series Not Present in Another Series

When comparing two datasets in Pandas, a common task is finding elements that exist in one Series but not in another, essentially computing the set difference. This is useful for identifying missing records, detecting new entries, finding unmatched values between datasets, or validating data consistency.

In this guide, you will learn how to find elements exclusive to one Series using the bitwise NOT operator (~) combined with isin(), along with alternative methods for different use cases.

The Core Technique: ~ With isin()

The isin() method checks whether each element of a Series is contained in another Series (or list), returning a Boolean mask. By applying the bitwise NOT operator (~), you invert the mask to select elements that are not present in the other Series:

result = series1[~series1.isin(series2)]

This reads as: "Give me all elements from series1 where the element is NOT in series2."
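To make the two steps visible, the one-liner above can be split into its intermediate masks. This is a minimal sketch; the values in series1 and series2 are made up for illustration:

```python
import pandas as pd

# Illustrative values only
series1 = pd.Series([10, 20, 30, 40])
series2 = pd.Series([20, 40, 60])

mask = series1.isin(series2)   # True where the value also appears in series2
inverted = ~mask               # True where the value does NOT appear in series2
result = series1[inverted]     # keep only the rows flagged True

print(mask.tolist())      # [False, True, False, True]
print(inverted.tolist())  # [True, False, True, False]
print(result.tolist())    # [10, 30]
```

Note that ~ is applied to the whole Boolean mask, not to individual values, so the call to isin() must be evaluated before the mask is inverted.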

Example 1: Integer Series

import pandas as pd

ps1 = pd.Series([2, 4, 8, 20, 10, 47, 99])
ps2 = pd.Series([1, 3, 6, 4, 10, 99, 50])

print("Series 1:")
print(ps1.values)

print("\nSeries 2:")
print(ps2.values)

# Elements in ps1 that are NOT in ps2
result = ps1[~ps1.isin(ps2)]
print("\nElements in Series 1 but not in Series 2:")
print(result)

Output:

Series 1:
[ 2  4  8 20 10 47 99]

Series 2:
[ 1  3  6  4 10 99 50]

Elements in Series 1 but not in Series 2:
0     2
2     8
3    20
5    47
dtype: int64

Values 4, 10, and 99 are present in both Series, so they are excluded. The remaining values (2, 8, 20, 47) appear only in ps1.

Example 2: Floating-Point Series

The same approach works with float values:

import pandas as pd

ps1 = pd.Series([2.8, 4.5, 8.0, 2.2, 10.1, 4.7, 9.9])
ps2 = pd.Series([1.4, 2.8, 4.7, 4.8, 10.1, 9.9, 50.12])

result = ps1[~ps1.isin(ps2)]
print("Elements in ps1 but not in ps2:")
print(result)

Output:

Elements in ps1 but not in ps2:
1    4.5
2    8.0
3    2.2
dtype: float64

Floating-point comparison precision

isin() performs exact equality checks. Due to floating-point arithmetic, values that appear identical may not match exactly:

import pandas as pd

ps1 = pd.Series([0.1 + 0.2]) # Results in 0.30000000000000004
ps2 = pd.Series([0.3])

print(ps1[~ps1.isin(ps2)])
# 0 0.3 ← Not filtered out because 0.1+0.2 ≠ 0.3 exactly

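isin() has no tolerance parameter. For float comparisons that require tolerance, either round both Series to a fixed number of decimals before calling isin(), or compare the values with NumPy's np.isclose() instead of relying on exact matching.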
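np.isclose() compares values element by element, so it needs a little extra work to act as a membership test. One possible sketch compares every value in ps1 against every value in ps2 with broadcasting. It assumes both Series are small enough for a len(ps1) × len(ps2) comparison matrix to fit in memory, and it uses np.isclose()'s default tolerances:

```python
import numpy as np
import pandas as pd

ps1 = pd.Series([0.1 + 0.2, 1.5, 2.0])
ps2 = pd.Series([0.3, 2.0])

# Exact matching keeps 0.1 + 0.2 because it is not bit-for-bit equal to 0.3
exact = ps1[~ps1.isin(ps2)]

# Pairwise tolerant comparison: rows are ps1 values, columns are ps2 values
close = np.isclose(ps1.to_numpy()[:, None], ps2.to_numpy()[None, :])
in_ps2 = close.any(axis=1)   # True if a ps1 value is close to ANY ps2 value

tolerant = ps1[~in_ps2]

print(exact.tolist())     # [0.30000000000000004, 1.5]
print(tolerant.tolist())  # [1.5]
```

For large Series the pairwise matrix grows quickly, so rounding both sides and then using isin() may be the more practical choice.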

Example 3: String Series

import pandas as pd

ps1 = pd.Series(['Alice', 'Bob', 'Charlie', 'Diana', 'Eve', 'Frank'])
ps2 = pd.Series(['Grace', 'Heidi', 'Diana', 'Ivan', 'Frank', 'Judy'])

result = ps1[~ps1.isin(ps2)]
print("Names in ps1 but not in ps2:")
print(result)

Output:

Names in ps1 but not in ps2:
0      Alice
1        Bob
2    Charlie
4        Eve
dtype: object

'Diana' and 'Frank' are present in both, so they are excluded from the result.
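String matching with isin() is exact, so differences in case or stray whitespace count as different values. If such differences should be ignored (an assumption that depends on your data), one option is to normalize both Series before comparing while still selecting from the original. The names below are hypothetical:

```python
import pandas as pd

ps1 = pd.Series(["Alice", "bob ", "Charlie"])
ps2 = pd.Series(["alice", "Bob"])

# Exact comparison: nothing matches, so every name is kept
exact = ps1[~ps1.isin(ps2)]

# Normalized comparison: strip whitespace and lowercase both sides
norm1 = ps1.str.strip().str.lower()
norm2 = ps2.str.strip().str.lower()
normalized = ps1[~norm1.isin(norm2)]   # mask from normalized values, rows from original

print(exact.tolist())       # ['Alice', 'bob ', 'Charlie']
print(normalized.tolist())  # ['Charlie']
```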

Finding the Difference in Both Directions

To find elements exclusive to each Series (symmetric difference), apply the operation in both directions:

import pandas as pd

ps1 = pd.Series([1, 2, 3, 4, 5])
ps2 = pd.Series([3, 4, 5, 6, 7])

only_in_ps1 = ps1[~ps1.isin(ps2)]
only_in_ps2 = ps2[~ps2.isin(ps1)]

print("Only in ps1:", only_in_ps1.values)
print("Only in ps2:", only_in_ps2.values)

Output:

Only in ps1: [1 2]
Only in ps2: [6 7]
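If a single Series holding the symmetric difference is more convenient, the two one-directional results can be combined with pd.concat(). A short sketch using the same data; ignore_index=True is a choice made here to avoid repeated index labels in the combined result:

```python
import pandas as pd

ps1 = pd.Series([1, 2, 3, 4, 5])
ps2 = pd.Series([3, 4, 5, 6, 7])

only_in_ps1 = ps1[~ps1.isin(ps2)]
only_in_ps2 = ps2[~ps2.isin(ps1)]

# Stack both results; ignore_index=True renumbers the rows from 0
sym_diff = pd.concat([only_in_ps1, only_in_ps2], ignore_index=True)

print(sym_diff.tolist())  # [1, 2, 6, 7]
```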

Alternative Methods

Using Set Operations

For simple cases where you only need the values (not the index), Python sets are concise:

import pandas as pd

ps1 = pd.Series([2, 4, 8, 20, 10, 47, 99])
ps2 = pd.Series([1, 3, 6, 4, 10, 99, 50])

# Set difference
diff = set(ps1) - set(ps2)
print("Difference:", diff)

Output:

Difference: {8, 2, 20, 47}

Sets vs. isin(): when to use which

| Approach | Preserves Index | Preserves Order | Handles Duplicates | Returns |
|---|---|---|---|---|
| ~isin() | ✅ Yes | ✅ Yes | ✅ Keeps all occurrences | Pandas Series |
| Set difference | ❌ No | ❌ No | ❌ Removes duplicates | Python set |

Use ~isin() when you need to keep the Series context (index, order, duplicates). Use sets when you only need the distinct values.

Using numpy.setdiff1d()

NumPy provides a dedicated function for finding the set difference:

import pandas as pd
import numpy as np

ps1 = pd.Series([2, 4, 8, 20, 10, 47, 99])
ps2 = pd.Series([1, 3, 6, 4, 10, 99, 50])

diff = np.setdiff1d(ps1, ps2)
print("Difference:", diff)

Output:

Difference: [ 2  8 20 47]

The result is a sorted NumPy array of unique values.
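The sorting and de-duplication are easiest to see on unsorted input with repeats. The following sketch uses made-up values and contrasts np.setdiff1d() with ~isin() on the same data:

```python
import numpy as np
import pandas as pd

# Illustrative data: unsorted, with a repeated 2
ps1 = pd.Series([47, 2, 2, 99, 8])
ps2 = pd.Series([99, 50])

# setdiff1d: sorted, unique values; the original index is not carried over
diff = np.setdiff1d(ps1.to_numpy(), ps2.to_numpy())

# ~isin(): original order, duplicates, and index labels are all kept
kept = ps1[~ps1.isin(ps2)]

print(diff.tolist())        # [2, 8, 47]
print(kept.tolist())        # [47, 2, 2, 8]
print(kept.index.tolist())  # [0, 1, 2, 4]
```

If a Series is needed afterwards, the array can be wrapped with pd.Series(diff), which assigns a fresh default index.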

Practical Example: Finding Missing Records

A common real-world use case is finding records that exist in one dataset but not another:

import pandas as pd

# All expected student IDs
expected = pd.Series([101, 102, 103, 104, 105, 106, 107, 108])

# IDs that submitted assignments
submitted = pd.Series([101, 103, 105, 107])

# Who hasn't submitted?
missing = expected[~expected.isin(submitted)]
print("Students who haven't submitted:")
print(missing)

Output:

Students who haven't submitted:
1    102
3    104
5    106
7    108
dtype: int64
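In practice the IDs often live in a DataFrame column next to other fields. The same mask can then filter whole rows. The sketch below assumes a hypothetical students table with student_id and name columns; neither the column names nor the data come from a real dataset:

```python
import pandas as pd

# Hypothetical roster for illustration
students = pd.DataFrame({
    "student_id": [101, 102, 103, 104],
    "name": ["Ana", "Ben", "Cy", "Dee"],
})
submitted = pd.Series([101, 103])

# Build the mask from one column, then use it to select entire rows
missing_rows = students[~students["student_id"].isin(submitted)]

print(missing_rows["student_id"].tolist())  # [102, 104]
print(missing_rows["name"].tolist())        # ['Ben', 'Dee']
```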

Handling Duplicates

The ~isin() approach preserves duplicates in the source Series. If an element appears multiple times in ps1 and is not in ps2, all occurrences are returned:

import pandas as pd

ps1 = pd.Series([1, 2, 2, 3, 3, 3, 4])
ps2 = pd.Series([2, 4])

result = ps1[~ps1.isin(ps2)]
print(result)

Output:

0    1
3    3
4    3
5    3
dtype: int64

All three occurrences of 3 are included since 3 is not in ps2.
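If each value should appear only once while the original order is kept, one option is to chain drop_duplicates() onto the filtered result. A brief sketch with the same data:

```python
import pandas as pd

ps1 = pd.Series([1, 2, 2, 3, 3, 3, 4])
ps2 = pd.Series([2, 4])

# Filter first, then keep only the first occurrence of each remaining value
unique_result = ps1[~ps1.isin(ps2)].drop_duplicates()

print(unique_result.tolist())        # [1, 3]
print(unique_result.index.tolist())  # [0, 3]
```

Unlike a Python set, this keeps the values in their original order and retains the index label of each first occurrence.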

Complete Comparison Example

import pandas as pd

# Product catalogs from two stores
store_a = pd.Series(['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam'])
store_b = pd.Series(['Keyboard', 'Headphones', 'Mouse', 'Speaker', 'Mic'])

print("Products ONLY in Store A:")
print(store_a[~store_a.isin(store_b)].values)

print("\nProducts ONLY in Store B:")
print(store_b[~store_b.isin(store_a)].values)

print("\nProducts in BOTH stores:")
print(store_a[store_a.isin(store_b)].values)

Output:

Products ONLY in Store A:
['Laptop' 'Monitor' 'Webcam']

Products ONLY in Store B:
['Headphones' 'Speaker' 'Mic']

Products in BOTH stores:
['Mouse' 'Keyboard']

Conclusion

Finding elements in one Pandas Series that are absent from another is accomplished by combining the bitwise NOT operator (~) with isin(): series1[~series1.isin(series2)].

This approach works with integers, floats, and strings alike (floats are matched by exact equality) and preserves the original index, order, and duplicates.

For simple distinct-value comparisons, Python sets or numpy.setdiff1d() offer concise alternatives.

This technique is essential for data validation, record matching, and identifying discrepancies between datasets.