Python Pandas: How to Count Rows in a Pandas DataFrame
Getting row counts is one of the most fundamental operations in data analysis. Whether you need the total size of a dataset, the number of non-null entries in a specific column, or the count of rows matching a particular condition, Pandas offers multiple approaches optimized for different use cases. Choosing the right method depends on what exactly you need to count and how you plan to use the result.
In this guide, you will learn how to count total rows, non-null values, conditional matches, value frequencies, and group-level counts.
Total Row Count
There are several ways to get the total number of rows in a DataFrame. Each has slightly different characteristics:
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC', 'Chicago'],
'Sales': [100, 200, 150, 175]
})
# Method 1: len() - fastest and most readable
total = len(df)
print(f"len(df): {total}")
# Method 2: shape - useful when you need both dimensions
rows, cols = df.shape
print(f"df.shape: {rows} rows, {cols} columns")
# Method 3: shape[0] - just the row count
print(f"df.shape[0]: {df.shape[0]}")
Output:
len(df): 4
df.shape: 4 rows, 2 columns
df.shape[0]: 4
| Method | Returns | Best For |
|---|---|---|
| len(df) | Integer | General use, fastest and most readable |
| df.shape[0] | Integer | Row count only, when you are already working with shape |
| df.shape | Tuple | When you need both dimensions at once |
Counting Non-Null Values
The .count() method returns the number of non-null (non-NaN) values in each column, which is different from the total row count when data contains missing values:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC'],
'Sales': [100, np.nan, 150]
})
print("Total rows:", len(df))
print()
print("Non-null per column:")
print(df.count())
Output:
Total rows: 3
Non-null per column:
City 3
Sales 2
dtype: int64
The City column has 3 non-null values (all complete), while Sales has only 2 because one entry is NaN. This distinction is important for understanding data completeness before performing calculations.
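A complementary check is to count the missing values themselves. The sketch below reuses the same sample data; the variable names `missing` and `non_null_per_row` are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC'],
    'Sales': [100, np.nan, 150]
})

# Missing values per column: isna() flags NaN cells, sum() tallies them
missing = df.isna().sum()
print(missing)

# Same numbers, derived as total rows minus non-null count
print(len(df) - df.count())

# Non-null values per row instead of per column
non_null_per_row = df.count(axis=1)
print(non_null_per_row.tolist())
```

Here `Sales` reports one missing value, and the second row has only one non-null entry.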
Conditional Row Counting
Counting rows that match specific conditions is essential for data exploration and validation.
Single Condition
The fastest approach is to create a boolean mask and sum it, since True counts as 1 and False as 0:
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC', 'Chicago'],
'Sales': [100, 200, 150, 175]
})
# How many rows are from NYC?
nyc_count = (df['City'] == 'NYC').sum()
print(f"NYC rows: {nyc_count}")
# How many rows have sales above 150?
high_sales = (df['Sales'] > 150).sum()
print(f"High sales rows: {high_sales}")
Output:
NYC rows: 2
High sales rows: 2
An alternative is to filter first and then measure the length. This builds an intermediate filtered DataFrame, so it is slightly less efficient, but it is sometimes more readable:
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC', 'Chicago'],
'Sales': [100, 200, 150, 175]
})
high_sales = len(df[df['Sales'] > 150])
print(f"High sales rows: {high_sales}")
Output:
High sales rows: 2
Multiple Conditions
Combine conditions using & (AND) and | (OR):
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC', 'Chicago'],
'Sales': [100, 200, 150, 175]
})
# NYC rows with sales above 100 (AND)
both = ((df['City'] == 'NYC') & (df['Sales'] > 100)).sum()
print(f"NYC with high sales: {both}")
# NYC rows OR sales above 175 (OR)
either = ((df['City'] == 'NYC') | (df['Sales'] > 175)).sum()
print(f"NYC or very high sales: {either}")
Output:
NYC with high sales: 1
NYC or very high sales: 3
Always wrap each condition in parentheses when using & or |. Without parentheses, Python's operator precedence causes & and | to bind before == and >, so the expression is evaluated in the wrong order and typically raises a TypeError or ValueError instead of returning a count.
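To make the precedence issue concrete, here is a minimal sketch that runs the AND example with and without parentheses. The exact exception type and message can vary between pandas versions, so the code catches both likely types rather than assuming one.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# Without parentheses, & is evaluated before == and >,
# so pandas is asked to compute 'NYC' & df['Sales'] first
error_name = None
try:
    broken = (df['City'] == 'NYC' & df['Sales'] > 100).sum()
except (TypeError, ValueError) as exc:
    error_name = type(exc).__name__
    print(f"Unparenthesized version failed with {error_name}")

# With parentheses, each comparison is evaluated before &
fixed = ((df['City'] == 'NYC') & (df['Sales'] > 100)).sum()
print(f"Parenthesized version: {fixed}")
```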
Value Frequencies
The value_counts() method counts how many times each unique value appears in a column, sorted from most to least frequent:
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC', 'Chicago', 'LA', 'NYC']
})
print(df['City'].value_counts())
Output:
City
NYC 3
LA 2
Chicago 1
Name: count, dtype: int64
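By default, value_counts() leaves missing values out of the tally. If missing entries matter for your analysis, the dropna=False argument includes them. The following sketch uses a small hypothetical Series with one NaN to show the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing city
cities = pd.Series(['NYC', 'LA', np.nan, 'NYC'], name='City')

# Default behavior: the NaN entry is not counted
default_counts = cities.value_counts()
print(default_counts)

# dropna=False adds a row for missing values
with_missing = cities.value_counts(dropna=False)
print(with_missing)
```

The default counts sum to 3, while the dropna=False counts sum to 4, matching the length of the Series.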
With Percentages
Add normalize=True to see proportions instead of raw counts:
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'LA', 'NYC', 'Chicago', 'LA', 'NYC']
})
pct = df['City'].value_counts(normalize=True) * 100
print(pct.round(1))
Output:
City
NYC 50.0
LA 33.3
Chicago 16.7
Name: proportion, dtype: float64
NYC accounts for half of all entries in the dataset.
Grouped Counting
To count rows within each group, combine groupby() with .size():
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'NYC', 'LA', 'LA'],
'Status': ['Active', 'Inactive', 'Active', 'Active']
})
# Count per city
print(df.groupby('City').size())
print()
# Count per city-status combination
print(df.groupby(['City', 'Status']).size())
Output:
City
LA 2
NYC 2
dtype: int64
City Status
LA Active 2
NYC Active 1
Inactive 1
dtype: int64
Getting a Clean DataFrame from Grouped Counts
import pandas as pd
df = pd.DataFrame({
'City': ['NYC', 'NYC', 'LA', 'LA'],
'Status': ['Active', 'Inactive', 'Active', 'Active']
})
counts = df.groupby(['City', 'Status']).size().reset_index(name='Count')
print(counts)
Output:
City Status Count
0 LA Active 2
1 NYC Active 1
2 NYC Inactive 1
The difference between .size() and .count() in grouped operations is that .size() counts all rows including those with NaN values, while .count() excludes NaN. Use .size() when you want the total number of rows per group regardless of missing data.
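The difference is easiest to see on data that contains a missing value. This sketch assumes a hypothetical Sales column with one NaN in the NYC group; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'NYC', 'LA', 'LA'],
    'Sales': [100, np.nan, 200, 250]
})

grouped = df.groupby('City')

# size() counts every row in each group, NaN or not
sizes = grouped.size()
print(sizes)

# count() counts only the non-null Sales values in each group
non_null = grouped['Sales'].count()
print(non_null)
```

Both groups have a size of 2, but the non-null count for NYC is 1 because one of its Sales values is missing.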
Quick Reference
| Goal | Method |
|---|---|
| Total rows | len(df) |
| Dimensions | df.shape returns (rows, cols) |
| Non-null per column | df.count() |
| Conditional count | (df['col'] == value).sum() |
| Multiple conditions | ((cond1) & (cond2)).sum() |
| Value frequencies | df['col'].value_counts() |
| Group counts | df.groupby('col').size() |
- Use len(df) for the total row count since it is the fastest and most readable option.
- Use df.count() when you need non-null counts per column.
- For conditional counting, sum boolean masks with (condition).sum() for the best performance.
- Use value_counts() for frequency tables of categorical data, and groupby().size() for counting rows within each group.