Python Pandas: How to Convert a Pandas Crosstab into a Stacked DataFrame

Crosstabs produce wide-format frequency tables that are great for quick visual inspection, but many analysis tools, visualization libraries like Seaborn, and machine learning pipelines require data in "long" (tidy) format instead. Converting between these formats is a common reshaping operation in data analysis workflows.

In this guide, you will learn how to transform a wide crosstab into a long stacked DataFrame using the .stack() method, flatten the result into a clean table, and reverse the operation when needed.

Creating a Crosstab

A crosstab counts the frequency of value combinations across two or more categorical variables, producing a wide-format table:

import pandas as pd

data = pd.DataFrame({
    'Brand': ['BMW', 'BMW', 'Benz', 'BMW', 'Benz'],
    'Fuel': ['Petrol', 'Diesel', 'Diesel', 'Petrol', 'Petrol']
})

ct = pd.crosstab(index=data['Brand'], columns=data['Fuel'])

print(ct)

Output:

Fuel   Diesel  Petrol
Brand                
BMW         1       2
Benz        1       1

This wide format shows one row per brand and one column per fuel type. While readable, many downstream tools expect each observation to occupy its own row.

Stacking to Long Format

The .stack() method moves column headers into row index values, converting the wide table into a long MultiIndex Series:

import pandas as pd

data = pd.DataFrame({
    'Brand': ['BMW', 'BMW', 'Benz', 'BMW', 'Benz'],
    'Fuel': ['Petrol', 'Diesel', 'Diesel', 'Petrol', 'Petrol']
})

ct = pd.crosstab(data['Brand'], data['Fuel'])

stacked = ct.stack()

print(stacked)

Output:

Brand  Fuel  
BMW    Diesel    1
       Petrol    2
Benz   Diesel    1
       Petrol    1
dtype: int64

Each brand-fuel combination now occupies its own row. The result is a Series with a MultiIndex, where the former column headers (Diesel, Petrol) have become part of the index.

Resetting the Index for a Flat DataFrame

The stacked result is a MultiIndex Series, which is not always convenient for further processing. Use .reset_index() to convert it into a regular flat DataFrame:

import pandas as pd

data = pd.DataFrame({
    'Brand': ['BMW', 'BMW', 'Benz', 'BMW', 'Benz'],
    'Fuel': ['Petrol', 'Diesel', 'Diesel', 'Petrol', 'Petrol']
})

ct = pd.crosstab(data['Brand'], data['Fuel'])
stacked = ct.stack()

# Flatten to a regular DataFrame with a named value column
df_long = stacked.reset_index(name='Count')

print(df_long)

Output:

  Brand    Fuel  Count
 BMW  Diesel      1
 BMW  Petrol      2
Benz  Diesel      1
Benz  Petrol      1

The name='Count' parameter gives the value column a descriptive name. This flat structure is ready for export, visualization, or further analysis.

tip

The complete transformation from crosstab to flat long-format DataFrame can be written as a single chained expression:

df_long = pd.crosstab(data['Brand'], data['Fuel']).stack().reset_index(name='Count')

Handling Multi-Level Crosstabs

When a crosstab has multiple column levels, .stack() operates on the innermost level by default. You can specify which level to stack using the level parameter:

import pandas as pd

data = pd.DataFrame({
    'Brand': ['BMW', 'BMW', 'Benz', 'Benz'],
    'Year': [2022, 2023, 2022, 2023],
    'Fuel': ['Petrol', 'Diesel', 'Diesel', 'Petrol']
})

ct = pd.crosstab(
    index=data['Brand'],
    columns=[data['Year'], data['Fuel']]
)

print("Crosstab:")
print(ct)
print()

# Stack the innermost level (Fuel) - default behavior
stacked_fuel = ct.stack()
print("Stacked (innermost level - Fuel):")
print(stacked_fuel)
print()

# Stack a specific level (Year)
stacked_year = ct.stack(level=0)
print("Stacked (level 0 - Year):")
print(stacked_year)

Output:

Crosstab:
Year    2022          2023       
Fuel  Diesel Petrol Diesel Petrol
Brand                            
BMW        0      1      1      0
Benz       1      0      0      1

Stacked (innermost level - Fuel):
Year          2022  2023
Brand Fuel              
BMW   Diesel     0     1
      Petrol     1     0
Benz  Diesel     1     0
      Petrol     0     1

Stacked (level 0 - Year):
Fuel        Diesel  Petrol
Brand Year                
BMW   2022       0       1
      2023       1       0
Benz  2022       1       0
      2023       0       1

Choosing which level to stack gives you control over the final shape of the data.

Filtering Zero Counts After Stacking

Crosstabs include zero counts for combinations that do not appear in the original data. After stacking, you may want to remove these empty combinations:

import pandas as pd

data = pd.DataFrame({
    'Brand': ['BMW', 'BMW', 'Benz'],
    'Fuel': ['Petrol', 'Petrol', 'Diesel']
})

ct = pd.crosstab(data['Brand'], data['Fuel'])
df_long = ct.stack().reset_index(name='Count')

print("With zero counts:")
print(df_long)
print()

# Remove rows where Count is 0
df_long = df_long[df_long['Count'] > 0]
print("Without zero counts:")
print(df_long)

Output:

With zero counts:
  Brand    Fuel  Count
0   BMW  Diesel      0
1   BMW  Petrol      2
2  Benz  Diesel      1
3  Benz  Petrol      0

Without zero counts:
  Brand    Fuel  Count
1   BMW  Petrol      2
2  Benz  Diesel      1

Unstacking: The Reverse Operation

The .unstack() method performs the opposite transformation, converting long format back to wide format:

import pandas as pd

# Start with long format data
df_long = pd.DataFrame({
    'Brand': ['BMW', 'BMW', 'Benz', 'Benz'],
    'Fuel': ['Diesel', 'Petrol', 'Diesel', 'Petrol'],
    'Count': [1, 2, 1, 1]
})

# Set the index and unstack
wide = df_long.set_index(['Brand', 'Fuel'])['Count'].unstack()

print(wide)

Output:

Fuel   Diesel  Petrol
Brand                
BMW         1       2
Benz        1       1

This produces the same wide format as the original crosstab, completing the round trip between wide and long formats.

Quick Reference

Method	Direction	Use Case
`.stack()`	Wide to long	Prepare data for plotting or ML
`.unstack()`	Long to wide	Create pivot-style summary tables
`.reset_index(name='col')`	Flatten MultiIndex	Export to CSV, Excel, or databases
`.stack(level=n)`	Stack specific level	Multi-level column control

Use .stack() to transform wide crosstab data into long format, which is the structure expected by most visualization libraries and machine learning pipelines.
Follow with .reset_index(name='column') to produce a clean, flat DataFrame ready for export or further processing.
Use .unstack() when you need to reverse the transformation and return to wide format.

Creating a Crosstab​

Stacking to Long Format​

Resetting the Index for a Flat DataFrame​

Handling Multi-Level Crosstabs​

Filtering Zero Counts After Stacking​

Unstacking: The Reverse Operation​

Quick Reference​

Table of Contents

Creating a Crosstab

Stacking to Long Format

Resetting the Index for a Flat DataFrame

Handling Multi-Level Crosstabs

Filtering Zero Counts After Stacking

Unstacking: The Reverse Operation

Quick Reference