Skip to main content

Python Pandas: How to Concatenate MultiIndex into Single Index in Pandas

Hierarchical (multi-level) indices in Pandas are powerful for representing structured data, but they can complicate display, export, and downstream processing. Flattening a MultiIndex into a single descriptive string index simplifies these tasks by combining the levels into readable labels like "2023_Q1" or "Sales_Revenue". This operation is common after groupby aggregations, pivot tables, or any operation that produces a MultiIndex.

In this guide, you will learn how to flatten both row and column MultiIndexes using different techniques, handle non-string data types, and choose the right approach for your formatting needs.

Using map() with join() for Fast Flattening

The most efficient method for MultiIndexes where all levels contain strings is to use .map() with a join function:

import pandas as pd

# Create a DataFrame with a MultiIndex
index = pd.MultiIndex.from_tuples([
('2023', 'Q1'),
('2023', 'Q2'),
('2024', 'Q1')
])
df = pd.DataFrame({'Sales': [100, 150, 200]}, index=index)

print("Before:")
print(df)
print()

# Flatten with underscore separator
df.index = df.index.map('_'.join)

print("After:")
print(df)

Output:

Before:
Sales
2023 Q1 100
Q2 150
2024 Q1 200

After:
Sales
2023_Q1 100
2023_Q2 150
2024_Q1 200

The '_'.join function is called on each tuple in the MultiIndex, joining the level values with an underscore. You can use any separator string.

Different Separator Options

import pandas as pd

index = pd.MultiIndex.from_tuples([('2023', 'Q1'), ('2023', 'Q2')])
df = pd.DataFrame({'Value': [100, 200]}, index=index)

# Various separators
print(index.map(' - '.join)) # ['2023 - Q1', '2023 - Q2']
print(index.map('/'.join)) # ['2023/Q1', '2023/Q2']
print(index.map('.'.join)) # ['2023.Q1', '2023.Q2']

Output:

Index(['2023 - Q1', '2023 - Q2'], dtype='object')
Index(['2023/Q1', '2023/Q2'], dtype='object')
Index(['2023.Q1', '2023.Q2'], dtype='object')

Handling Non-String Data Types

When MultiIndex levels contain numbers or other non-string types, the join() method will raise a TypeError. You need to convert the values to strings first:

import pandas as pd

index = pd.MultiIndex.from_tuples([
('Category', 101),
('Category', 102),
('Category', 103)
])
df = pd.DataFrame({'Value': [10, 20, 30]}, index=index)

# This fails because 101, 102, 103 are integers
try:
df.index = df.index.map('_'.join)
except TypeError as e:
print(f"Error: {e}")

Output:

Error: sequence item 1: expected str instance, int found

The fix is to convert each element to a string before joining:

import pandas as pd

index = pd.MultiIndex.from_tuples([
('Category', 101),
('Category', 102),
('Category', 103)
])
df = pd.DataFrame({'Value': [10, 20, 30]}, index=index)

df.index = df.index.map(lambda x: '_'.join(map(str, x)))

print(df)

Output:

              Value
Category_101 10
Category_102 20
Category_103 30
tip

Using lambda x: '_'.join(map(str, x)) is a safe universal approach that works regardless of the data types in your MultiIndex levels. If you know all levels are strings, the simpler '_'.join is faster.

Custom Formatting with f-strings

For more complex formatting where you need different treatment for each level, use a lambda with an f-string:

import pandas as pd

index = pd.MultiIndex.from_tuples([
('Alice', 'NYC'),
('Bob', 'LA'),
('Charlie', 'Chicago')
])
df = pd.DataFrame({'Score': [85, 92, 78]}, index=index)

# Custom formatted index
df.index = df.index.map(lambda x: f"{x[0]} ({x[1]})")

print(df)

Output:

                   Score
Alice (NYC) 85
Bob (LA) 92
Charlie (Chicago) 78
note

This approach gives you complete control over how each level is incorporated into the final string. You can reference levels by position (x[0], x[1], etc.) and apply any formatting you need.

Flattening Column MultiIndex

The same techniques work for hierarchical column names, which commonly appear after pivot table operations or multi-level aggregations:

import pandas as pd

# Create a DataFrame with MultiIndex columns
columns = pd.MultiIndex.from_tuples([
('Sales', 'Q1'),
('Sales', 'Q2'),
('Profit', 'Q1')
])
df = pd.DataFrame([[100, 150, 20], [200, 250, 40]], columns=columns)

print("Before:")
print(df)
print()

# Flatten column names
df.columns = df.columns.map('_'.join)

print("After:")
print(df)

Output:

Before:
Sales Profit
Q1 Q2 Q1
0 100 150 20
1 200 250 40

After:
Sales_Q1 Sales_Q2 Profit_Q1
0 100 150 20
1 200 250 40

Flattened column names are much easier to reference in subsequent code (df['Sales_Q1'] instead of df[('Sales', 'Q1')]) and export cleanly to CSV files.

Selecting Specific Levels with get_level_values()

When your MultiIndex has three or more levels and you only want to combine some of them, use get_level_values() to extract individual levels:

import pandas as pd

index = pd.MultiIndex.from_tuples([
('2023', 'Jan', 'Week1'),
('2023', 'Jan', 'Week2'),
('2023', 'Feb', 'Week1')
])
df = pd.DataFrame({'Sales': [100, 120, 150]}, index=index)

# Combine only the first two levels, ignoring the third
level0 = df.index.get_level_values(0)
level1 = df.index.get_level_values(1)
df.index = level0 + '_' + level1

print(df)

Output:

          Sales
2023_Jan 100
2023_Jan 120
2023_Feb 150

This gives you precise control over which levels are included in the flattened index and which are discarded.

Practical Example: Flattening After GroupBy

A common scenario where MultiIndex flattening is needed is after a groupby aggregation with multiple levels:

import pandas as pd

df = pd.DataFrame({
'Region': ['East', 'East', 'West', 'West'],
'Product': ['A', 'B', 'A', 'B'],
'Sales': [100, 150, 200, 250],
'Profit': [10, 20, 30, 40]
})

# GroupBy produces a MultiIndex
grouped = df.groupby(['Region', 'Product']).sum()
print("After groupby (MultiIndex):")
print(grouped)
print()

# Flatten for easier access
grouped.index = grouped.index.map(lambda x: '_'.join(x))
print("After flattening:")
print(grouped)

Output:

After groupby (MultiIndex):
Sales Profit
Region Product
East A 100 10
B 150 20
West A 200 30
B 250 40

After flattening:
Sales Profit
East_A 100 10
East_B 150 20
West_A 200 30
West_B 250 40
info

An alternative to flattening after groupby is to use as_index=False in the groupby() call itself, which returns a regular DataFrame with the grouping columns as regular columns instead of an index. Choose flattening when you want a descriptive single index; choose as_index=False when you want the grouping values as separate columns.

Quick Reference

MethodBest ForHandles Mixed Types
index.map('_'.join)Simple string joinsNo
index.map(lambda x: '_'.join(map(str, x)))Mixed data typesYes
f-string lambdaCustom formattingYes
get_level_values()Selecting specific levelsYes
  • Use index.map('_'.join) for fast flattening when all levels are strings.
  • For mixed data types, wrap with map(str, x) to convert all values to strings first.
  • Use f-string lambdas when you need custom formatting beyond simple concatenation.
  • All of these techniques apply equally to column MultiIndexes by operating on df.columns instead of df.index.