How to Count Frequency of Column Values in Pandas
Understanding how values are distributed across a column is one of the first steps in any data exploration workflow. Whether you need to know which product sells most frequently, what percentage of users are active, or how values are spread across categories, frequency counting gives you an immediate picture of your data's composition. Pandas provides several methods for this purpose, each suited to different levels of detail and complexity.
In this guide, you will learn how to count value occurrences, calculate proportions, compare frequencies across multiple columns, and handle common scenarios like missing values and rare categories.
Basic Frequency Count with value_counts()
The value_counts() method is the standard tool for counting how often each unique value appears in a column. By default, results are sorted in descending order by frequency:
import pandas as pd
df = pd.DataFrame({
'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple']
})
print(df['Fruit'].value_counts())
Output:
Fruit
Apple 3
Banana 2
Orange 1
Name: count, dtype: int64
Apple appears 3 times, Banana twice, and Orange once. The most frequent value appears first.
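If all you need is the single most frequent value, you can read it off the counts directly. A minimal sketch using the same DataFrame (idxmax() returns the index label with the largest count):
import pandas as pd
df = pd.DataFrame({
    'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple']
})
counts = df['Fruit'].value_counts()
# idxmax() gives the most frequent label; max() gives its count
print(counts.idxmax())  # Apple
print(counts.max())     # 3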
Getting Proportions Instead of Counts
Add normalize=True to see what fraction of the total each value represents:
import pandas as pd
df = pd.DataFrame({
'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple']
})
print(df['Fruit'].value_counts(normalize=True))
Output:
Fruit
Apple 0.500000
Banana 0.333333
Orange 0.166667
Name: proportion, dtype: float64
Apple accounts for 50% of all entries. Multiply by 100 if you prefer percentage format:
import pandas as pd
df = pd.DataFrame({
'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple']
})
print((df['Fruit'].value_counts(normalize=True) * 100).round(1))
Output:
Fruit
Apple 50.0
Banana 33.3
Orange 16.7
Name: proportion, dtype: float64
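If you need display-ready strings (for a report, say), one option is to map a format string over the percentages. A small sketch, again with the same Fruit column:
import pandas as pd
df = pd.DataFrame({
    'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple']
})
pct = df['Fruit'].value_counts(normalize=True) * 100
# map() applies the format string to each value, yielding strings like '50.0%'
print(pct.map('{:.1f}%'.format))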
Including Missing Values
By default, value_counts() excludes NaN values. Use dropna=False to include them in the count:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Status': ['Active', 'Active', np.nan, 'Inactive', np.nan]
})
print("Excluding NaN (default):")
print(df['Status'].value_counts())
print()
print("Including NaN:")
print(df['Status'].value_counts(dropna=False))
Output:
Excluding NaN (default):
Status
Active 2
Inactive 1
Name: count, dtype: int64
Including NaN:
Status
Active 2
NaN 2
Inactive 1
Name: count, dtype: int64
With dropna=False, you can see that 2 out of 5 records are missing a status, which is valuable information for data quality assessment.
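If you would rather label the missing values explicitly than see NaN in the index, filling them before counting is a common alternative. A minimal sketch (the 'Unknown' label here is an arbitrary choice):
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Status': ['Active', 'Active', np.nan, 'Inactive', np.nan]
})
# Replace NaN with a placeholder label, then count as usual
print(df['Status'].fillna('Unknown').value_counts())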
Controlling Sort Order
value_counts() sorts by descending frequency by default, but you can sort alphabetically by the values themselves or flip the order:
import pandas as pd
df = pd.DataFrame({'Grade': ['B', 'A', 'C', 'A', 'B', 'A']})
# Sort alphabetically by the value (not by count)
print("Sorted by value:")
print(df['Grade'].value_counts().sort_index())
print()
# Sort by count in ascending order (least common first)
print("Ascending count:")
print(df['Grade'].value_counts(ascending=True))
Output:
Sorted by value:
Grade
A 3
B 2
C 1
Name: count, dtype: int64
Ascending count:
Grade
C 1
B 2
A 3
Name: count, dtype: int64
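value_counts() also accepts sort=False, which skips the frequency sort entirely. In recent pandas versions this typically yields first-appearance order for object columns, though the documentation only promises that no sorting is applied:
import pandas as pd
df = pd.DataFrame({'Grade': ['B', 'A', 'C', 'A', 'B', 'A']})
# sort=False returns the counts without sorting by frequency
print(df['Grade'].value_counts(sort=False))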
Cross-Tabulation with pd.crosstab()
When you need to compare frequencies across two categorical columns, pd.crosstab() produces a contingency table:
import pandas as pd
df = pd.DataFrame({
'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana'],
'Color': ['Red', 'Yellow', 'Green', 'Orange', 'Yellow']
})
table = pd.crosstab(df['Fruit'], df['Color'])
print(table)
Output:
Color Green Orange Red Yellow
Fruit
Apple 1 0 1 0
Banana 0 0 0 2
Orange 0 1 0 0
Adding Row and Column Totals
The margins=True parameter adds summary totals:
import pandas as pd
df = pd.DataFrame({
'Fruit': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana'],
'Color': ['Red', 'Yellow', 'Green', 'Orange', 'Yellow']
})
table = pd.crosstab(df['Fruit'], df['Color'], margins=True)
print(table)
Output:
Color Green Orange Red Yellow All
Fruit
Apple 1 0 1 0 2
Banana 0 0 0 2 2
Orange 0 1 0 0 1
All 1 1 1 2 5
The "All" row and column show the marginal totals, giving you a complete picture of the distribution.
Multi-Column Frequency with groupby().size()
When you need to count combinations of values across multiple columns, groupby().size() provides a flexible approach:
import pandas as pd
df = pd.DataFrame({
'Category': ['A', 'B', 'A', 'B', 'A'],
'Status': ['Active', 'Active', 'Inactive', 'Active', 'Active']
})
counts = df.groupby(['Category', 'Status']).size()
print(counts)
Output:
Category Status
A Active 2
Inactive 1
B Active 2
dtype: int64
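Since pandas 1.1, you can also call value_counts() directly on a DataFrame, which counts unique rows and sorts the combinations by frequency rather than by index:
import pandas as pd
df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A'],
    'Status': ['Active', 'Active', 'Inactive', 'Active', 'Active']
})
# DataFrame.value_counts() counts each unique (Category, Status) row
print(df[['Category', 'Status']].value_counts())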
Converting to a Flat DataFrame
For export or further processing, use .reset_index() to flatten the result:
import pandas as pd
df = pd.DataFrame({
'Category': ['A', 'B', 'A', 'B', 'A'],
'Status': ['Active', 'Active', 'Inactive', 'Active', 'Active']
})
counts_df = df.groupby(['Category', 'Status']).size().reset_index(name='Count')
print(counts_df)
Output:
Category Status Count
0 A Active 2
1 A Inactive 1
2 B Active 2
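Once flattened, the result is an ordinary DataFrame, so the usual tools apply; for instance, you can sort it so the most frequent combination comes first:
import pandas as pd
df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B', 'A'],
    'Status': ['Active', 'Active', 'Inactive', 'Active', 'Active']
})
counts_df = df.groupby(['Category', 'Status']).size().reset_index(name='Count')
# Sort the flat table by frequency, most common combination first
print(counts_df.sort_values('Count', ascending=False))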
Binning Numeric Data into Frequency Groups
For continuous numeric columns, pd.cut() creates bins that you can then count:
import pandas as pd
df = pd.DataFrame({'Age': [22, 35, 45, 23, 38, 52, 28]})
df['Age_Group'] = pd.cut(
df['Age'],
bins=[0, 30, 50, 100],
labels=['Young', 'Middle', 'Senior']
)
print(df['Age_Group'].value_counts())
Output:
Age_Group
Young 3
Middle 3
Senior 1
Name: count, dtype: int64
This is useful for understanding distributions without looking at every individual value.
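If you want bins holding roughly equal numbers of rows rather than fixed edges, pd.qcut() splits at quantile boundaries instead. A quick sketch with the same ages (the three labels are arbitrary):
import pandas as pd
df = pd.DataFrame({'Age': [22, 35, 45, 23, 38, 52, 28]})
# qcut chooses bin edges so each bin holds about the same number of rows
df['Age_Band'] = pd.qcut(df['Age'], q=3, labels=['Low', 'Mid', 'High'])
print(df['Age_Band'].value_counts())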
Practical Examples
Finding the Top N Most Common Values
import pandas as pd
df = pd.DataFrame({
'Product': ['A', 'B', 'A', 'C', 'A', 'B', 'D', 'A']
})
# Top 3 most common products
top_3 = df['Product'].value_counts().head(3)
print(top_3)
Output:
Product
A 4
B 2
C 1
Name: count, dtype: int64
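The same pattern works from the other end: since the counts are sorted descending, tail() surfaces the least common values:
import pandas as pd
df = pd.DataFrame({
    'Product': ['A', 'B', 'A', 'C', 'A', 'B', 'D', 'A']
})
# tail(2) on the descending counts gives the 2 least common products
print(df['Product'].value_counts().tail(2))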
Filtering Out Rare Values
Sometimes you want to keep only categories that appear frequently enough to be meaningful:
import pandas as pd
df = pd.DataFrame({
'Category': ['A', 'A', 'A', 'B', 'B', 'C', 'D']
})
counts = df['Category'].value_counts()
# Keep only categories appearing more than once
common = counts[counts > 1].index
df_filtered = df[df['Category'].isin(common)]
print("Original categories:", df['Category'].unique().tolist())
print("After filtering:", df_filtered['Category'].unique().tolist())
Output:
Original categories: ['A', 'B', 'C', 'D']
After filtering: ['A', 'B']
Categories C and D each appear only once and are filtered out.
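Instead of dropping rare categories, you can also collapse them into a catch-all bucket. This sketch uses Series.where(), with 'Other' as an arbitrary label:
import pandas as pd
df = pd.DataFrame({
    'Category': ['A', 'A', 'A', 'B', 'B', 'C', 'D']
})
counts = df['Category'].value_counts()
common = counts[counts > 1].index
# where() keeps values meeting the condition and replaces the rest with 'Other'
df['Category'] = df['Category'].where(df['Category'].isin(common), 'Other')
print(df['Category'].value_counts())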
Building a Complete Frequency Table
import pandas as pd
df = pd.DataFrame({
'Status': ['Active', 'Active', 'Active', 'Inactive', 'Pending']
})
counts = df['Status'].value_counts()
percentages = df['Status'].value_counts(normalize=True) * 100
freq_table = pd.DataFrame({
'Count': counts,
'Percent': percentages.round(1),
'Cumulative': percentages.cumsum().round(1)
})
print(freq_table)
Output:
Count Percent Cumulative
Status
Active 3 60.0 60.0
Inactive 1 20.0 80.0
Pending 1 20.0 100.0
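The cumulative column makes Pareto-style questions straightforward, e.g., which statuses account for the first 80% of records. A short sketch building on the same table (the 80% threshold is an arbitrary choice):
import pandas as pd
df = pd.DataFrame({
    'Status': ['Active', 'Active', 'Active', 'Inactive', 'Pending']
})
pct = df['Status'].value_counts(normalize=True) * 100
freq_table = pd.DataFrame({
    'Count': df['Status'].value_counts(),
    'Percent': pct.round(1),
    'Cumulative': pct.cumsum().round(1)
})
# Keep only the statuses needed to cover the first 80% of records
print(freq_table[freq_table['Cumulative'] <= 80])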
Quick Reference
| Method | Best For | Output |
|---|---|---|
| .value_counts() | Single-column frequency | Series |
| .value_counts(normalize=True) | Proportions and percentages | Series |
| .value_counts(dropna=False) | Including missing values | Series |
| pd.crosstab() | Two-column comparison | DataFrame |
| .groupby().size() | Multi-column combinations | Series |
| pd.cut() + .value_counts() | Binning numeric data | Series |
- Use value_counts() for quick single-column frequency analysis.
- Add normalize=True for proportions or dropna=False to include missing values in the count.
- Use pd.crosstab() to compare two categorical columns side by side.
- Use groupby().size() when counting multi-column combinations or when you need to chain the result with other aggregation operations.