Python Pandas: How to Filter Rows Using Method Chaining
Method chaining is a programming style in which multiple operations are applied to a DataFrame one after another in a single expression. Instead of creating an intermediate variable for each step, you chain methods together to produce concise, readable data transformations.
This guide demonstrates how to filter DataFrame rows using Pandas method chaining with various techniques - from simple value matching to complex multi-condition filtering - along with clear examples and outputs.
Creating a Sample DataFrame
Let's start with a sample DataFrame that we'll use throughout this guide:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
print(data)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
1 102 Jack Wills 23 Uk
2 101 Deepanshu Rustagi 20 India
3 106 Thomas James 34 Australia
4 103 Jenny Advekar 18 Uk
5 104 Yash Raj 56 India
6 107 Raman Dutt Mishra 35 India
Filtering by a Specific Value
Using .eq() for Exact Match
The .eq() method checks for equality and returns a boolean Series. When used inside bracket notation, it filters rows where the condition is True:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Filter rows where Country is "India"
result = data[data.Country.eq("India")]
print(result)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
2 101 Deepanshu Rustagi 20 India
5 104 Yash Raj 56 India
6 107 Raman Dutt Mishra 35 India
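Because .eq() returns a boolean Series, it can also be placed inside .loc[] with a lambda so that further methods follow in the same expression. The following is a minimal sketch using the sample DataFrame; the extra .sort_values() step is only an illustration and not part of the original example:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Use .eq() inside .loc[] so the filter becomes one step of a longer chain
result = (
    data
    .loc[lambda df: df["Country"].eq("India")]  # keep only rows from India
    .sort_values("Age")                         # illustrative follow-up step
)
print(result)
```

The lambda receives the DataFrame at that point in the chain, so the filter does not need to refer to the variable name data.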
Using .pipe() for Chainable Filtering
The .pipe() method passes the entire DataFrame to a function, enabling custom filtering logic within a chain:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Filter using pipe: chainable and reusable
result = (
    data
    .pipe(lambda df: df[df["Country"] == "India"])
)
print(result)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
2 101 Deepanshu Rustagi 20 India
5 104 Yash Raj 56 India
6 107 Raman Dutt Mishra 35 India
.pipe() is especially powerful when you define reusable filter functions:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Reusable filter: returns only the rows whose Country matches
def filter_by_country(df, country):
    return df[df["Country"] == country]
result = data.pipe(filter_by_country, "Uk")
print(result)
Output:
ID Name Age Country
1 102 Jack Wills 23 Uk
4 103 Jenny Advekar 18 Uk
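Reusable filter functions can be composed by calling .pipe() more than once. The sketch below assumes a second helper, filter_by_max_age, which is a hypothetical function introduced here only for illustration:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

def filter_by_country(df, country):
    """Keep rows whose Country equals the given value."""
    return df[df["Country"] == country]

def filter_by_max_age(df, max_age):
    """Hypothetical helper: keep rows whose Age is at most max_age."""
    return df[df["Age"] <= max_age]

# Each .pipe() passes the previous step's result as the first argument
result = (
    data
    .pipe(filter_by_country, "India")
    .pipe(filter_by_max_age, 40)
)
print(result)
```

Keeping each condition in its own named function makes the chain read like a list of steps and lets the same filters be reused on other DataFrames with the same columns.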
Filtering by Numeric Conditions
Using Boolean Indexing
The most straightforward approach - apply a comparison directly:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Filter rows where Age is less than 30
result = data[data.Age < 30]
print(result)
Output:
ID Name Age Country
1 102 Jack Wills 23 Uk
2 101 Deepanshu Rustagi 20 India
4 103 Jenny Advekar 18 Uk
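Boolean indexing also handles ranges. As a small sketch (the age bounds here are chosen only for illustration), two comparisons can be combined with & provided each one is wrapped in parentheses, or the same range can be expressed with Series.between(), which is inclusive on both ends by default:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Combine two conditions; parentheses are required because & binds
# more tightly than the comparison operators
combined = data[(data["Age"] >= 20) & (data["Age"] < 30)]

# Equivalent range check for integer ages (20 to 29 inclusive)
ranged = data[data["Age"].between(20, 29)]

print(combined)
```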
Chaining .loc[] with Lambda Functions
You can chain multiple .loc[] calls, each using a lambda function to apply a filter condition. This is one of the most expressive forms of Pandas chaining:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Chain multiple conditions: ID <= 103 AND Age == 23
result = (
    data
    .loc[lambda df: df["ID"] <= 103]
    .loc[lambda df: df["Age"] == 23]
)
print(result)
Output:
ID Name Age Country
1 102 Jack Wills 23 Uk
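The lambda form matters most when a filter depends on something created earlier in the same chain. In the sketch below, the column Age_In_5 is a hypothetical name added with .assign(); it does not exist on the original data, so only a lambda (which receives the intermediate DataFrame) can refer to it:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

result = (
    data
    .assign(Age_In_5=lambda df: df["Age"] + 5)   # hypothetical derived column
    .loc[lambda df: df["Age_In_5"] >= 40]        # filter on the new column
)
print(result)
```

Writing data["Age_In_5"] in the second step would fail with a KeyError, because the original DataFrame is not modified by .assign().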
Chaining Multiple Filters with .query()
The .query() method accepts string expressions and is naturally chainable:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Chain multiple query conditions
result = (
    data
    .query("Age >= 20")
    .query("Age <= 40")
    .query("Country == 'India'")
)
print(result)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
2 101 Deepanshu Rustagi 20 India
6 107 Raman Dutt Mishra 35 India
You can also combine all conditions in a single .query() call:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
result = data.query("Age >= 20 and Age <= 40 and Country == 'India'")
print(result)
Output is the same:
ID Name Age Country
0 105 Ram Kumar 40 India
2 101 Deepanshu Rustagi 20 India
6 107 Raman Dutt Mishra 35 India
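Query strings do not have to hard-code their values. Prefixing a name with @ inside .query() refers to a Python variable in the surrounding scope. The variable names below (min_age, max_age, country) are assumptions chosen for this sketch:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Illustrative parameters; @name looks up the Python variable of that name
min_age = 20
max_age = 40
country = "India"

result = data.query("Age >= @min_age and Age <= @max_age and Country == @country")
print(result)
```

This keeps the expression readable while avoiding manual string formatting of values into the query.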
Filtering by String Patterns
Using .str.contains() for Substring Matching
The .str.contains() method filters rows where a column contains a specific substring:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Filter rows where Name contains "am"
result = data[data.Name.str.contains("am")]
print(result)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
3 106 Thomas James 34 Australia
6 107 Raman Dutt Mishra 35 India
Chaining String Filters
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Chain: Names containing "am" AND from India
result = (
    data
    .loc[lambda df: df["Name"].str.contains("am")]
    .loc[lambda df: df["Country"] == "India"]
)
print(result)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
6 107 Raman Dutt Mishra 35 India
By default, .str.contains() interprets the pattern as a regular expression. To search for literal strings (especially those containing special regex characters like . or *), set regex=False:
data[data.Name.str.contains("R.", regex=False)] # Literal "R."
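To make the difference concrete, here is a short sketch on the sample DataFrame. With the default regex behaviour, "R." means "R followed by any character"; with regex=False it means the two literal characters "R" and ".". The case=False lookup at the end is an extra illustration of another .str.contains() parameter:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Regex (default): "." matches any character, so every name with a capital R
# followed by another character matches
regex_match = data[data["Name"].str.contains("R.")]

# Literal: no name contains the exact text "R.", so nothing matches
literal_match = data[data["Name"].str.contains("R.", regex=False)]

# Case-insensitive substring search
raj_match = data[data["Name"].str.contains("raj", case=False)]

print(len(regex_match), len(literal_match), len(raj_match))
```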
Filtering by a Set of Values
Using .isin() for Multiple Value Matching
The .isin() method checks whether each value is in a given list of acceptable values:
# Filter rows where Country is either "Uk" or "Australia"
target_countries = ["Uk", "Australia"]
result = data[data.Country.isin(target_countries)]
print(result)
Output:
ID Name Age Country
1 102 Jack Wills 23 Uk
3 106 Thomas James 34 Australia
4 103 Jenny Advekar 18 Uk
Excluding Values with ~ (Negation)
Invert .isin() to exclude specific values:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
# Filter rows where Country is NOT "Uk" or "Australia"
result = data[~data.Country.isin(["Uk", "Australia"])]
print(result)
Output:
ID Name Age Country
0 105 Ram Kumar 40 India
2 101 Deepanshu Rustagi 20 India
5 104 Yash Raj 56 India
6 107 Raman Dutt Mishra 35 India
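Both .isin() and its negation can be written in chainable form. The sketch below shows two ways that should give the same rows on the sample DataFrame: ~ inside a .loc[] lambda, and the not in operator inside .query(). The variable name excluded is an assumption for this example:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

excluded = ["Uk", "Australia"]  # illustrative list of values to drop

# Option 1: negated .isin() inside a .loc[] lambda
via_loc = data.loc[lambda df: ~df["Country"].isin(excluded)]

# Option 2: "not in" inside a query string
via_query = data.query("Country not in ['Uk', 'Australia']")

print(via_loc)
```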
Building Complex Chains
The real power of method chaining emerges when you combine filtering, transformation, and aggregation in a single expression:
import pandas as pd
data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})
result = (
    data
    .loc[lambda df: df["Country"] == "India"]        # Filter by country
    .loc[lambda df: df["Age"] > 30]                  # Filter by age
    .assign(Age_Group=lambda df: df["Age"].apply(    # Add a new column
        lambda x: "Senior" if x >= 50 else "Middle"
    ))
    .sort_values("Age", ascending=False)             # Sort by age
    .reset_index(drop=True)                          # Reset the index
)
print(result)
Output:
ID Name Age Country Age_Group
0 104 Yash Raj 56 India Senior
1 105 Ram Kumar 40 India Middle
2 107 Raman Dutt Mishra 35 India Middle
This chain performs five operations (two filters, a column assignment, a sort, and an index reset) without a single intermediate variable - each step feeds its output into the next.
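A chain can also end in an aggregation. The following sketch filters the sample DataFrame and then summarises it per country; the output column names People and Mean_Age are assumptions chosen for illustration:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

summary = (
    data
    .loc[lambda df: df["Age"] >= 20]      # filter: drop anyone under 20
    .groupby("Country")                   # group the remaining rows
    .agg(
        People=("ID", "count"),           # number of rows per country
        Mean_Age=("Age", "mean"),         # average age per country
    )
    .reset_index()                        # turn Country back into a column
)
print(summary)
```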
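Common Mistake: Chained Assignment and SettingWithCopyWarning
When you assign through chained indexing (two sets of brackets in a row), Pandas may emit a SettingWithCopyWarning, because the assignment can land on a temporary copy and leave the original DataFrame unchanged: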
# ❌ May raise a warning: modifying a slice of the original DataFrame
data[data["Country"] == "India"]["Age"] = 0
The correct approach is to use .loc[] for combined filtering and assignment:
# ✅ Correct: no warning
data.loc[data["Country"] == "India", "Age"] = 0
Or use .assign() within a chain to create new columns without modifying the original:
# ✅ Correct: creates a new DataFrame
result = data.assign(Age_Doubled=lambda df: df["Age"] * 2)
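The same idea extends to conditional updates. As a sketch, Series.mask() replaces values where a condition is True, so combining it with .assign() gives a chain-friendly counterpart to the .loc[] assignment above that leaves the original DataFrame untouched:

```python
import pandas as pd

data = pd.DataFrame({
    "ID": [105, 102, 101, 106, 103, 104, 107],
    "Name": [
        "Ram Kumar", "Jack Wills", "Deepanshu Rustagi",
        "Thomas James", "Jenny Advekar", "Yash Raj",
        "Raman Dutt Mishra"
    ],
    "Age": [40, 23, 20, 34, 18, 56, 35],
    "Country": ["India", "Uk", "India", "Australia", "Uk", "India", "India"]
})

# Set Age to 0 for rows from India, returning a new DataFrame
result = data.assign(
    Age=lambda df: df["Age"].mask(df["Country"] == "India", 0)
)

print(result["Age"].tolist())
print(data["Age"].tolist())  # original values are unchanged
```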
Summary of Chaining Methods for Filtering
| Method | Use Case | Chainable |
|---|---|---|
| df[condition] | Simple boolean filtering | ⚠️ Limited |
| df.loc[lambda df: ...] | Flexible, multi-step filtering | ✅ Excellent |
| df.query("expression") | Readable string-based conditions | ✅ Excellent |
| df.pipe(func) | Reusable custom filter functions | ✅ Excellent |
| df[df.col.eq(value)] | Exact value matching | ✅ Good |
| df[df.col.isin(list)] | Matching against a set of values | ✅ Good |
| df[df.col.str.contains()] | Substring/regex matching | ✅ Good |
Conclusion
Pandas method chaining is a powerful technique that produces clean, readable, and maintainable data filtering code:
- .loc[] with lambda functions is the most versatile approach - chain multiple .loc[] calls for multi-condition filtering.
- .query() provides the most readable syntax with string-based expressions, ideal for complex conditions.
- .pipe() enables reusable, composable filter functions that integrate seamlessly into chains.
- .eq(), .isin(), and .str.contains() handle specific matching scenarios - exact values, sets of values, and substring patterns respectively.
By combining these methods into fluent chains, you can transform raw DataFrames into precisely filtered results in a single, expressive statement - without cluttering your code with intermediate variables.