Python Pandas: How to Filter Rows Using Method Chaining

Method chaining is a programming style in which multiple operations are applied to a DataFrame one after another in a single expression. Instead of creating an intermediate variable for each step, you chain methods together to produce concise, readable data transformations.

This guide demonstrates how to filter DataFrame rows using Pandas method chaining with various techniques - from simple value matching to complex multi-condition filtering - along with clear examples and outputs.
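To make the contrast concrete, here is a minimal sketch that performs the same three steps twice, once with intermediate variables and once as a chain. The three-row frame and the variable names are hypothetical and exist only for this comparison.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this comparison
df = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [25, 41, 33]})

# Step-by-step style: one intermediate variable per operation
adults = df[df["Age"] > 30]
sorted_adults = adults.sort_values("Age")
stepwise = sorted_adults.reset_index(drop=True)

# Chained style: the same three operations in one expression
chained = (
    df
    .loc[lambda d: d["Age"] > 30]
    .sort_values("Age")
    .reset_index(drop=True)
)

print(chained.equals(stepwise))  # True
```

Both versions give the same result; the chained form simply avoids naming each intermediate frame.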

Creating a Sample DataFrame

Let's start with a sample DataFrame that we'll use throughout this guide:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

print(data)

Output:

    ID               Name  Age    Country
0  105          Ram Kumar   40      India
1  102         Jack Wills   23         Uk
2  101  Deepanshu Rustagi   20      India
3  106       Thomas James   34  Australia
4  103      Jenny Advekar   18         Uk
5  104           Yash Raj   56      India
6  107  Raman Dutt Mishra   35      India

Filtering by a Specific Value

Using .eq() for Exact Match

The .eq() method checks for equality and returns a boolean Series. When used inside bracket notation, it filters rows where the condition is True:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Filter rows where Country is "India"
result = data[data.Country.eq("India")]
print(result)

Output:

    ID               Name  Age  Country
0  105          Ram Kumar   40    India
2  101  Deepanshu Rustagi   20    India
5  104           Yash Raj   56    India
6  107  Raman Dutt Mishra   35    India
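It can help to look at the boolean Series on its own before using it as a filter. The sketch below uses a short hypothetical Series rather than the full sample DataFrame, and also shows .ne(), the "not equal" counterpart of .eq().

```python
import pandas as pd

# Hypothetical four-value Series, used only to inspect the mask
countries = pd.Series(["India", "Uk", "India", "Australia"], name="Country")

# .eq() gives the same result as countries == "India"
mask = countries.eq("India")
print(mask.tolist())                   # [True, False, True, False]

# .ne() is the "not equal" counterpart
print(countries.ne("India").tolist())  # [False, True, False, True]
```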

Using .pipe() for Chainable Filtering

The .pipe() method passes the entire DataFrame to a function, enabling custom filtering logic within a chain:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Filter using pipe: chainable and reusable
result = (
    data
    .pipe(lambda df: df[df["Country"] == "India"])
)
print(result)

Output:

    ID               Name  Age  Country
0  105          Ram Kumar   40    India
2  101  Deepanshu Rustagi   20    India
5  104           Yash Raj   56    India
6  107  Raman Dutt Mishra   35    India

Tip: .pipe() is especially powerful when you define reusable filter functions:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})


def filter_by_country(df, country):
    # Keep only the rows whose Country equals the given value
    return df[df["Country"] == country]

# Extra positional arguments to .pipe() are forwarded to the function
result = data.pipe(filter_by_country, "Uk")
print(result)

Output:

    ID           Name  Age  Country
1  102     Jack Wills   23       Uk
4  103  Jenny Advekar   18       Uk
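Reusable filters pay off once several of them are chained. The following sketch combines filter_by_country with a second helper, filter_by_max_age, which is a hypothetical function introduced only for this illustration. Arguments are passed by keyword so each step documents itself.

```python
import pandas as pd

# Reduced version of the sample data, enough to show two piped filters
data = pd.DataFrame({
    "Name": ["Ram Kumar", "Jack Wills", "Jenny Advekar", "Yash Raj"],
    "Age": [40, 23, 18, 56],
    "Country": ["India", "Uk", "Uk", "India"],
})


def filter_by_country(df, country):
    return df[df["Country"] == country]


def filter_by_max_age(df, max_age):
    # Hypothetical helper: keep rows with Age at or below max_age
    return df[df["Age"] <= max_age]


result = (
    data
    .pipe(filter_by_country, country="Uk")
    .pipe(filter_by_max_age, max_age=20)
)
print(result["Name"].tolist())  # ['Jenny Advekar']
```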

Filtering by Numeric Conditions

Using Boolean Indexing

The most straightforward approach - apply a comparison directly:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Filter rows where Age is less than 30
result = data[data.Age < 30]
print(result)

Output:

    ID               Name  Age  Country
1  102         Jack Wills   23       Uk
2  101  Deepanshu Rustagi   20    India
4  103      Jenny Advekar   18       Uk
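Boolean indexing also handles more than one comparison at a time. As a minimal sketch on a reduced, four-row version of the sample data, the conditions are joined with & and each comparison is wrapped in parentheses; the comparison methods .ge() and .lt() express the same filter without the extra parentheses.

```python
import pandas as pd

# Reduced version of the sample data
data = pd.DataFrame({
    "Name": ["Ram Kumar", "Jack Wills", "Deepanshu Rustagi", "Jenny Advekar"],
    "Age": [40, 23, 20, 18],
})

# Each comparison needs its own parentheses because & binds
# more tightly than the comparison operators
combined = data[(data["Age"] >= 20) & (data["Age"] < 30)]

# .ge() and .lt() are the method forms of >= and <
method_form = data[data["Age"].ge(20) & data["Age"].lt(30)]

print(combined["Name"].tolist())  # ['Jack Wills', 'Deepanshu Rustagi']
```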

Chaining .loc[] with Lambda Functions

You can chain multiple .loc[] calls, each using a lambda function to apply a filter condition. This is one of the most expressive forms of Pandas chaining:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Chain multiple conditions: ID <= 103 AND Age == 23
result = (
    data
    .loc[lambda df: df["ID"] <= 103]
    .loc[lambda df: df["Age"] == 23]
)
print(result)

Output:

    ID        Name  Age  Country
1  102  Jack Wills   23       Uk
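The lambda matters because it receives the DataFrame as it exists at that point in the chain, not the original data. The sketch below illustrates this by filtering on a column that is created one step earlier; the column name Age_Next_Year is hypothetical and exists only for this example.

```python
import pandas as pd

# Reduced version of the sample data
data = pd.DataFrame({
    "ID": [105, 102, 101, 103],
    "Age": [40, 23, 20, 18],
})

result = (
    data
    # Hypothetical derived column, created inside the chain
    .assign(Age_Next_Year=lambda df: df["Age"] + 1)
    # The lambda sees the frame returned by .assign(), so the new column is available
    .loc[lambda df: df["Age_Next_Year"] > 20]
)
print(result["ID"].tolist())  # [105, 102, 101]
```

Writing data["Age_Next_Year"] in the filter instead would fail with a KeyError, since the original data has no such column.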

Chaining Multiple Filters with .query()

The .query() method accepts string expressions and is naturally chainable:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Chain multiple query conditions
result = (
    data
    .query("Age >= 20")
    .query("Age <= 40")
    .query("Country == 'India'")
)
print(result)

Output:

    ID               Name  Age  Country
0  105          Ram Kumar   40    India
2  101  Deepanshu Rustagi   20    India
6  107  Raman Dutt Mishra   35    India

You can also combine all conditions in a single .query() call:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

result = data.query("Age >= 20 and Age <= 40 and Country == 'India'")
print(result)

Output is the same:

    ID               Name  Age  Country
0  105          Ram Kumar   40    India
2  101  Deepanshu Rustagi   20    India
6  107  Raman Dutt Mishra   35    India
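Hard-coded values inside the query string are not always practical. In .query(), a name prefixed with @ refers to a Python variable in the surrounding scope. The sketch below uses a reduced version of the sample data; the variable names min_age, max_age, and country are assumptions made for this example.

```python
import pandas as pd

# Reduced version of the sample data
data = pd.DataFrame({
    "Name": ["Ram Kumar", "Jack Wills", "Deepanshu Rustagi", "Yash Raj"],
    "Age": [40, 23, 20, 56],
    "Country": ["India", "Uk", "India", "India"],
})

# Illustrative local variables referenced from the query string with @
min_age, max_age = 20, 40
country = "India"

result = data.query("Age >= @min_age and Age <= @max_age and Country == @country")
print(result["Name"].tolist())  # ['Ram Kumar', 'Deepanshu Rustagi']
```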

Filtering by String Patterns

Using .str.contains() for Substring Matching

The .str.contains() method filters rows where a column contains a specific substring:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Filter rows where Name contains "am"
result = data[data.Name.str.contains("am")]
print(result)

Output:

    ID               Name  Age    Country
0  105          Ram Kumar   40      India
3  106       Thomas James   34  Australia
6  107  Raman Dutt Mishra   35      India

Chaining String Filters

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Chain: Names containing "am" AND from India
result = (
    data
    .loc[lambda df: df["Name"].str.contains("am")]
    .loc[lambda df: df["Country"] == "India"]
)
print(result)

Output:

    ID               Name  Age  Country
0  105          Ram Kumar   40    India
6  107  Raman Dutt Mishra   35    India

Note: By default, .str.contains() interprets the pattern as a regular expression. To search for literal strings (especially those containing special regex characters like . or *), set regex=False:

data[data.Name.str.contains("R.", regex=False)]  # Literal "R."
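The difference is easiest to see side by side. In this sketch the second name, "R. K. Narayan", is a hypothetical value added so that one entry really contains the literal text "R."; the case=False line shows the related option for case-insensitive matching.

```python
import pandas as pd

# The second name is hypothetical, added to include a literal "R."
names = pd.Series(["Ram Kumar", "R. K. Narayan", "Thomas James"])

# As a regex, "R." means "R followed by any character", so "Ram Kumar" matches too
as_regex = names.str.contains("R.")
print(as_regex.tolist())          # [True, True, False]

# With regex=False only the literal two-character string "R." matches
as_literal = names.str.contains("R.", regex=False)
print(as_literal.tolist())        # [False, True, False]

# case=False makes the match case-insensitive
ignore_case = names.str.contains("RAM", case=False)
print(ignore_case.tolist())       # [True, False, False]
```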

Filtering by a Set of Values

Using .isin() for Multiple Value Matching

The .isin() method checks whether each value is in a given list of acceptable values:

# Filter rows where Country is either "Uk" or "Australia"
target_countries = ["Uk", "Australia"]
result = data[data.Country.isin(target_countries)]
print(result)

Output:

    ID           Name  Age    Country
1  102     Jack Wills   23         Uk
3  106   Thomas James   34  Australia
4  103  Jenny Advekar   18         Uk

Excluding Values with ~ (Negation)

Invert .isin() to exclude specific values:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Filter rows where Country is NOT "Uk" or "Australia"
result = data[~data.Country.isin(["Uk", "Australia"])]
print(result)

Output:

    ID               Name  Age  Country
0  105          Ram Kumar   40    India
2  101  Deepanshu Rustagi   20    India
5  104           Yash Raj   56    India
6  107  Raman Dutt Mishra   35    India
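The negated mask works the same way inside a chain. A minimal sketch, using a reduced version of the sample data, that excludes two countries with .loc[] and a lambda and then applies a second filter:

```python
import pandas as pd

# Reduced version of the sample data
data = pd.DataFrame({
    "Name": ["Ram Kumar", "Jack Wills", "Thomas James", "Yash Raj"],
    "Age": [40, 23, 34, 56],
    "Country": ["India", "Uk", "Australia", "India"],
})

result = (
    data
    .loc[lambda df: ~df["Country"].isin(["Uk", "Australia"])]  # exclude two countries
    .loc[lambda df: df["Age"] < 50]                            # then filter by age
)
print(result["Name"].tolist())  # ['Ram Kumar']
```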

Building Complex Chains

The real power of method chaining emerges when you combine filtering, transformation, and aggregation in a single expression:

import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

result = (
    data
    .loc[lambda df: df["Country"] == "India"]       # Filter by country
    .loc[lambda df: df["Age"] > 30]                 # Filter by age
    .assign(Age_Group=lambda df: df["Age"].apply(   # Add a new column
        lambda x: "Senior" if x >= 50 else "Middle"
    ))
    .sort_values("Age", ascending=False)            # Sort by age
    .reset_index(drop=True)                         # Reset the index
)
print(result)

Output:

    ID               Name  Age  Country Age_Group
0  104           Yash Raj   56    India    Senior
1  105          Ram Kumar   40    India    Middle
2  107  Raman Dutt Mishra   35    India    Middle

This chain performs five operations (two filters, a new column, a sort, and an index reset) without a single intermediate variable - each step feeds its output into the next.
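The chain above filters, transforms, and sorts but does not aggregate. As a sketch of how an aggregation step might slot into the same style, the example below groups the filtered rows by country. The output column names People and Mean_Age are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Age and Country columns from the sample data
data = pd.DataFrame({
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"],
})

summary = (
    data
    .loc[lambda df: df["Age"] >= 20]                 # Keep rows with Age of at least 20
    .groupby("Country", as_index=False)              # One group per country
    .agg(People=("Age", "count"),                    # Rows per country
         Mean_Age=("Age", "mean"))                   # Average age per country
    .sort_values("Mean_Age", ascending=False)        # Highest average first
    .reset_index(drop=True)
)
print(summary)
```

With this data, India comes first with four people and a mean age of 37.75, followed by Australia and Uk with one person each.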

Common Mistake: Chained Assignment and SettingWithCopyWarning

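When you assign through chained indexing (filtering first, then selecting a column), Pandas may emit a SettingWithCopyWarning, and the assignment may never reach the original DataFrame because it is applied to a temporary copy:

# ❌ May emit a warning: this modifies a temporary copy, so data may be left unchanged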
data[data["Country"] == "India"]["Age"] = 0

The correct approach is to use .loc[] for combined filtering and assignment:

# ✅ Correct: no warning
data.loc[data["Country"] == "India", "Age"] = 0

Or use .assign() within a chain to create new columns without modifying the original:

# ✅ Correct: creates a new DataFrame
result = data.assign(Age_Doubled=lambda df: df["Age"] * 2)
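The same idea extends to "filter, then modify" inside a chain. In this minimal sketch on a reduced version of the sample data, .assign() overwrites the Age column on the filtered rows and returns a new DataFrame, so the original stays as it was. The variable name india_reset is hypothetical.

```python
import pandas as pd

# Reduced version of the sample data
data = pd.DataFrame({
    "Name": ["Ram Kumar", "Jack Wills", "Yash Raj"],
    "Age": [40, 23, 56],
    "Country": ["India", "Uk", "India"],
})

# Filter and modify in one chain; .assign() returns a new DataFrame
india_reset = (
    data
    .loc[lambda df: df["Country"] == "India"]
    .assign(Age=0)
)

print(india_reset["Age"].tolist())  # [0, 0]
print(data["Age"].tolist())         # [40, 23, 56], the original is untouched
```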

Summary of Chaining Methods for Filtering

Method                     Use Case                          Chainable
df[condition]              Simple boolean filtering          ⚠️ Limited
df.loc[lambda df: ...]     Flexible, multi-step filtering    ✅ Excellent
df.query("expression")     Readable string-based conditions  ✅ Excellent
df.pipe(func)              Reusable custom filter functions  ✅ Excellent
df[df.col.eq(value)]       Exact value matching              ✅ Good
df[df.col.isin(list)]      Matching against a set of values  ✅ Good
df[df.col.str.contains()]  Substring/regex matching          ✅ Good

Conclusion

Pandas method chaining is a powerful technique that produces clean, readable, and maintainable data filtering code:

  • .loc[] with lambda functions is the most versatile approach - chain multiple .loc[] calls for multi-condition filtering.
  • .query() provides the most readable syntax with string-based expressions, ideal for complex conditions.
  • .pipe() enables reusable, composable filter functions that integrate seamlessly into chains.
  • .eq(), .isin(), and .str.contains() handle specific matching scenarios - exact values, sets of values, and substring patterns respectively.

By combining these methods into fluent chains, you can transform raw DataFrames into precisely filtered results in a single, expressive statement - without cluttering your code with intermediate variables.