Skip to main content

Python Pandas: How to Select Rows from a DataFrame Based on Column Values in Python

When working with tabular data in Python, one of the most common tasks is filtering rows based on the values in one or more columns. Whether you need to isolate records that meet a specific threshold, match a category, or satisfy a combination of conditions, Pandas offers several powerful and expressive ways to accomplish this.

This guide walks you through the most practical methods for selecting rows from a Pandas DataFrame based on column values, from simple boolean indexing to SQL-like queries, complete with examples and outputs.

Setting Up the Sample DataFrame

All examples in this article use the following DataFrame:

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age         City
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston
4 Eva 29 New York

Boolean Indexing (The Most Common Approach)

Boolean indexing is the most widely used technique for filtering rows in Pandas. You pass a condition - which evaluates to a Series of True/False values - directly into the bracket operator of the DataFrame.

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

selected = df[df["Age"] > 25]
print(selected)

Output:

    Name  Age         City
1 Bob 27 Los Angeles
3 David 32 Houston
4 Eva 29 New York

Every row where the condition df["Age"] > 25 is True is included in the result.

Combining Multiple Conditions

You can chain multiple conditions using the bitwise operators & (AND), | (OR), and ~ (NOT). Each individual condition must be wrapped in parentheses; otherwise Python's operator precedence will raise an error.

Common Mistake - Missing Parentheses
# ❌ This raises a ValueError
result = df[df["Age"] > 25 & df["City"] == "New York"]

Because & has higher precedence than > and ==, the expression is evaluated incorrectly.

The correct way is to wrap each condition in parentheses:

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df[(df["Age"] > 25) & (df["City"] == "New York")]
print(result)

Output:

  Name  Age      City
4 Eva 29 New York

You can also use | for OR logic:

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df[(df["Age"] < 24) | (df["City"] == "Houston")]
print(result)

Output:

      Name  Age     City
2 Charlie 22 Chicago
3 David 32 Houston

Using loc for Conditional Row Selection

The loc accessor selects rows by label or boolean array. It works just like bracket-based boolean indexing but offers an important advantage: you can simultaneously filter rows and select specific columns.

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df.loc[df["City"] == "Chicago"]
print(result)

Output:

      Name  Age     City
2 Charlie 22 Chicago

Filtering Rows and Selecting Columns at the Same Time

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df.loc[df["Age"] >= 27, ["Name", "City"]]
print(result)

Output:

    Name         City
1 Bob Los Angeles
3 David Houston
4 Eva New York
When to prefer loc

Use loc when you want to filter rows and restrict columns in a single, readable expression. It also avoids the chained-indexing pitfalls that can lead to the SettingWithCopyWarning.

Using the query Method for SQL-Like Filtering

The query method lets you write filter conditions as a string expression, which can feel more natural - especially if you have a SQL background. It also tends to produce cleaner code when conditions are complex.

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df.query("Age < 30")
print(result)

Output:

      Name  Age         City
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
4 Eva 29 New York

Referencing External Variables with @

You can reference Python variables inside a query string by prefixing them with @:

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)


min_age = 25
target_city = "New York"

result = df.query("Age > @min_age and City == @target_city")
print(result)

Output:

  Name  Age      City
4 Eva 29 New York

Using isin for Membership-Based Selection

When you need to check whether column values belong to a set of allowed values, the isin method is the cleanest option. It replaces long chains of OR conditions with a single, readable call.

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df[df["City"].isin(["New York", "Chicago"])]
print(result)

Output:

      Name  Age      City
0 Alice 24 New York
2 Charlie 22 Chicago
4 Eva 29 New York

To exclude certain values, negate the condition with ~:

import pandas as pd

data = {
"Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
"Age": [24, 27, 22, 32, 29],
"City": ["New York", "Los Angeles", "Chicago", "Houston", "New York"],
}
df = pd.DataFrame(data)

result = df[~df["City"].isin(["Los Angeles", "Houston"])]
print(result)

Output:

      Name  Age      City
0 Alice 24 New York
2 Charlie 22 Chicago
4 Eva 29 New York

Quick Comparison of Methods

MethodBest ForSupports Column SelectionString Syntax
df[condition]Simple, fast filteringNoNo
df.loc[condition]Filtering + column selectionYesNo
df.query("...")Complex or SQL-like conditionsNo (directly)Yes
df[df["col"].isin([...])]Matching against a list of valuesNoNo

Summary

Selecting rows based on column values is a foundational skill for any data analysis workflow in Python. Here is a quick recap of when to use each method:

  • Boolean indexing (df[condition]) - your go-to for straightforward filters.
  • loc - ideal when you need to filter rows and pick specific columns in one step.
  • query - great for readability, especially with multiple conditions or external variables.
  • isin - the best choice when checking membership against a list of values.

All four approaches are performant for typical dataset sizes. Choose the one that makes your code the most readable and maintainable for the task at hand.