
Python Pandas: How to Count Rows in a Pandas DataFrame

Getting row counts is one of the most fundamental operations in data analysis. Whether you need the total size of a dataset, the number of non-null entries in a specific column, or the count of rows matching a particular condition, Pandas offers multiple approaches optimized for different use cases. Choosing the right method depends on what exactly you need to count and how you plan to use the result.

In this guide, you will learn how to count total rows, non-null values, conditional matches, value frequencies, and group-level counts.

Total Row Count

There are several ways to get the total number of rows in a DataFrame. Each has slightly different characteristics:

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# Method 1: len() - fastest and most readable
total = len(df)
print(f"len(df): {total}")

# Method 2: shape - useful when you need both dimensions
rows, cols = df.shape
print(f"df.shape: {rows} rows, {cols} columns")

# Method 3: shape[0] - just the row count
print(f"df.shape[0]: {df.shape[0]}")

Output:

len(df): 4
df.shape: 4 rows, 2 columns
df.shape[0]: 4

| Method | Returns | Best For |
| --- | --- | --- |
| len(df) | Integer | General use, fastest and most readable |
| df.shape[0] | Integer | When you only need the row count and are already working with shape |
| df.shape | Tuple | When you need both dimensions at once |
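As a quick sanity check, the sketch below confirms that these approaches agree, including on a DataFrame with no rows. `len(df.index)` is included as one more equivalent spelling; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# Three spellings of the same row count
n_len = len(df)
n_shape = df.shape[0]
n_index = len(df.index)
print(n_len, n_shape, n_index)

# Slicing away every row leaves zero rows but keeps both columns
empty = df.iloc[0:0]
print(len(empty), empty.shape)
```

All three values match, and the empty slice reports zero rows while still having two columns.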

Counting Non-Null Values

The .count() method returns the number of non-null values in each column (NaN, None, and NaT are all treated as missing). This differs from the total row count whenever the data contains missing values:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC'],
    'Sales': [100, np.nan, 150]
})

print("Total rows:", len(df))
print()
print("Non-null per column:")
print(df.count())

Output:

Total rows: 3

Non-null per column:
City     3
Sales    2
dtype: int64

The City column has 3 non-null values (all complete), while Sales has only 2 because one entry is NaN. This distinction is important for understanding data completeness before performing calculations.
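A few related counts are often useful alongside df.count(). The sketch below, reusing the same small DataFrame, shows the non-null count for a single column, the number of missing values per column, and the non-null count per row. The variable names are illustrative only.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC'],
    'Sales': [100, np.nan, 150]
})

# Non-null count for one column (a single integer rather than a Series)
sales_present = df['Sales'].count()

# Missing values per column: the complement of df.count()
missing_per_column = df.isna().sum()

# Non-null values per row instead of per column
present_per_row = df.count(axis=1)

print(sales_present)
print(missing_per_column)
print(present_per_row.tolist())
```

For every column, the non-null count and the missing count add up to len(df).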

Conditional Row Counting

Counting rows that match specific conditions is essential for data exploration and validation.

Single Condition

A concise and efficient approach is to create a boolean mask and sum it, since True counts as 1 and False as 0:

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# How many rows are from NYC?
nyc_count = (df['City'] == 'NYC').sum()
print(f"NYC rows: {nyc_count}")

# How many rows have sales above 150?
high_sales = (df['Sales'] > 150).sum()
print(f"High sales rows: {high_sales}")

Output:

NYC rows: 2
High sales rows: 2

An alternative is to filter first and then measure the length. This is slightly less efficient because it builds a filtered copy of the DataFrame just to count it, but it is sometimes more readable:

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

high_sales = len(df[df['Sales'] > 150])
print(f"High sales rows: {high_sales}")

Output:

High sales rows: 2
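To make the "True counts as 1" behavior concrete, the following sketch prints the mask itself and then derives both a count and a share from it. Using .mean() for the fraction of matching rows is an addition for illustration, not something the examples above rely on.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# One True/False value per row
mask = df['Sales'] > 150
print(mask.tolist())

# Summing the mask counts the True values
count = int(mask.sum())

# Averaging the mask gives the fraction of rows that match
share = float(mask.mean())

# Filtering and measuring the length gives the same count
filtered_count = len(df[mask])

print(count, share, filtered_count)
```

Storing the mask in a variable also lets you reuse it for both counting and filtering without evaluating the condition twice.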

Multiple Conditions

Combine conditions using & (AND) and | (OR):

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# NYC rows with sales above 100 (AND)
both = ((df['City'] == 'NYC') & (df['Sales'] > 100)).sum()
print(f"NYC with high sales: {both}")

# NYC rows OR sales above 175 (OR)
either = ((df['City'] == 'NYC') | (df['Sales'] > 175)).sum()
print(f"NYC or very high sales: {either}")

Output:

NYC with high sales: 1
NYC or very high sales: 3
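
Tip: Always wrap each condition in parentheses when using & or |. Python's operator precedence binds & and | more tightly than == and >, so an unparenthesized expression such as df['City'] == 'NYC' & df['Sales'] > 100 is not evaluated as two separate comparisons and typically raises an error instead of producing the mask you intended.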
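One way to sidestep the parentheses issue is to give each condition its own name before combining them. The sketch below takes that approach and also shows ~ (NOT) and Series.isin() for matching against several values; the mask names are hypothetical and chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago'],
    'Sales': [100, 200, 150, 175]
})

# Name each condition once, then combine the named masks
is_nyc = df['City'] == 'NYC'
above_100 = df['Sales'] > 100

# AND: NYC rows with sales above 100
nyc_above_100 = int((is_nyc & above_100).sum())

# NOT + AND: rows outside NYC with sales above 100
other_above_100 = int((~is_nyc & above_100).sum())

# Membership test instead of chaining several == comparisons with |
nyc_or_la = int(df['City'].isin(['NYC', 'LA']).sum())

print(nyc_above_100, other_above_100, nyc_or_la)
```

Because each comparison is evaluated on its own line, precedence between & and the comparison operators never comes into play.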

Value Frequencies

The value_counts() method counts how many times each unique value appears in a column, sorted from most to least frequent:

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago', 'LA', 'NYC']
})

print(df['City'].value_counts())

Output:

City
NYC        3
LA         2
Chicago    1
Name: count, dtype: int64

With Percentages

Add normalize=True to see proportions instead of raw counts:

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago', 'LA', 'NYC']
})

pct = df['City'].value_counts(normalize=True) * 100

print(pct.round(1))

Output:

City
NYC        50.0
LA         33.3
Chicago    16.7
Name: proportion, dtype: float64

NYC accounts for half of all entries in the dataset.
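It can be convenient to see counts and percentages side by side. The sketch below combines the two value_counts() results into one small table and also shows the dropna parameter, which controls whether missing values are counted. The `summary` and `with_gap` names are illustrative, and the Series with a missing entry is made-up data.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago', 'LA', 'NYC']
})

counts = df['City'].value_counts()
percent = (df['City'].value_counts(normalize=True) * 100).round(1)

# Both Series are indexed by city name, so they align by label
summary = pd.DataFrame({'count': counts, 'percent': percent})
print(summary)

# Missing values are skipped by default; dropna=False includes them
with_gap = pd.Series(['NYC', None, 'LA', 'NYC'])
counted_default = int(with_gap.value_counts().sum())
counted_with_missing = int(with_gap.value_counts(dropna=False).sum())
print(counted_default, counted_with_missing)
```

With dropna=False the counts add up to the full length of the Series, which makes it easier to reconcile a frequency table with the total row count.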

Grouped Counting

To count rows within each group, combine groupby() with .size():

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'NYC', 'LA', 'LA'],
    'Status': ['Active', 'Inactive', 'Active', 'Active']
})

# Count per city
print(df.groupby('City').size())
print()

# Count per city-status combination
print(df.groupby(['City', 'Status']).size())

Output:

City
LA     2
NYC    2
dtype: int64

City  Status
LA    Active      2
NYC   Active      1
      Inactive    1
dtype: int64

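Getting a Clean DataFrame from Grouped Counts

The result of .size() is a Series indexed by the group keys. Chain .reset_index(name='Count') to turn it into a regular DataFrame with a named count column: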

import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'NYC', 'LA', 'LA'],
    'Status': ['Active', 'Inactive', 'Active', 'Active']
})

counts = df.groupby(['City', 'Status']).size().reset_index(name='Count')

print(counts)

Output:

  City    Status  Count
0   LA    Active      2
1  NYC    Active      1
2  NYC  Inactive      1
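
Note that combinations with no rows, such as LA and Inactive here, do not appear in the grouped result at all. If a wide table with explicit zeros is preferable, one possible approach is to unstack the inner group level, as sketched below; the `table` name is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NYC', 'NYC', 'LA', 'LA'],
    'Status': ['Active', 'Inactive', 'Active', 'Active']
})

# Move Status from the row index to the columns, filling absent pairs with 0
table = df.groupby(['City', 'Status']).size().unstack(fill_value=0)

print(table)
```

Each row is a city and each column a status, so a missing combination shows up as 0 rather than being omitted.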

Info: The difference between .size() and .count() in grouped operations is that .size() counts all rows, including those with NaN values, while .count() excludes NaN. Use .size() when you want the total number of rows per group regardless of missing data.
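
The difference between the two methods only becomes visible when a group contains missing data. The sketch below uses a small made-up DataFrame with one NaN in Sales to show .size() and .count() side by side; the data and variable names are assumptions for illustration.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'City': ['NYC', 'NYC', 'LA', 'LA'],
    'Sales': [100, np.nan, 200, 150]
})

# size(): every row in the group, missing values included
rows_per_city = df.groupby('City').size()

# count(): only the non-null Sales values in each group
sales_per_city = df.groupby('City')['Sales'].count()

print(rows_per_city)
print(sales_per_city)
```

NYC has two rows but only one recorded sale, so the two results differ for that group and agree for LA.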

Quick Reference

| Goal | Method |
| --- | --- |
| Total rows | len(df) |
| Dimensions | df.shape returns (rows, cols) |
| Non-null per column | df.count() |
| Conditional count | (df['col'] == value).sum() |
| Multiple conditions | ((cond1) & (cond2)).sum() |
| Value frequencies | df['col'].value_counts() |
| Group counts | df.groupby('col').size() |

  • Use len(df) for the total row count since it is the fastest and most readable option.
  • Use df.count() when you need non-null counts per column.
  • For conditional counting, sum boolean masks with (condition).sum(), which avoids building a filtered copy of the data.
  • Use value_counts() for frequency tables of categorical data, and groupby().size() for counting rows within each group.