Python Pandas: How to Collapse or Combine Multiple Columns in Pandas

Merging multiple columns into a single column is a common operation in data transformation. You might need to create totals from quarterly figures, combine first and last names into a full name, or reshape wide data into a long format for visualization. Pandas provides several approaches depending on whether you are working with numeric data, text data, or restructuring the DataFrame layout entirely.

In this guide, you will learn how to sum and aggregate numeric columns, concatenate string columns while handling missing values, reshape wide DataFrames into long format, and apply custom combination logic.

Summing Numeric Columns

The simplest way to combine numeric columns is to select them and sum across each row using axis=1:

import pandas as pd

df = pd.DataFrame({
    'Q1': [100, 150],
    'Q2': [120, 180],
    'Q3': [130, 190]
})

# Sum all three quarters into a total
df['Total'] = df[['Q1', 'Q2', 'Q3']].sum(axis=1)

# Sum only the first two quarters
df['H1_Total'] = df[['Q1', 'Q2']].sum(axis=1)

print(df)

Output:

    Q1   Q2   Q3  Total  H1_Total
0  100  120  130    350       220
1  150  180  190    520       330

The axis=1 parameter tells Pandas to sum across columns (horizontally) rather than down rows.

Collapsing Columns by Category with `groupby()`

When you have multiple columns that belong to the same logical category, such as quarterly sales and quarterly returns, you can collapse them into aggregated totals using a column mapping:

import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2],
    'Q1_Sales': [100, 150],
    'Q2_Sales': [120, 180],
    'Q1_Returns': [10, 15],
    'Q2_Returns': [8, 12]
})

# Map each column to its target category
mapping = {
    'Q1_Sales': 'Total_Sales',
    'Q2_Sales': 'Total_Sales',
    'Q1_Returns': 'Total_Returns',
    'Q2_Returns': 'Total_Returns'
}

collapsed = df.set_index('ID').groupby(mapping, axis=1).sum().reset_index()

print(collapsed)

Output:

   ID  Total_Returns  Total_Sales
0   1             18          220
1   2             27          330

The mapping dictionary tells groupby() which columns should be combined together. All columns mapped to 'Total_Sales' are summed, and all columns mapped to 'Total_Returns' are summed independently.

info

The axis=1 parameter in groupby() has been deprecated in recent Pandas versions. For newer versions, you can achieve the same result by transposing, grouping, and transposing back, or by manually selecting and summing the relevant column groups.

Concatenating String Columns

For text columns, use the + operator to join values with a separator:

import pandas as pd

df = pd.DataFrame({
    'First': ['John', 'Jane'],
    'Last': ['Doe', 'Smith'],
    'Title': ['Mr.', 'Dr.']
})

# Simple two-column concatenation
df['Full_Name'] = df['First'] + ' ' + df['Last']

# Combining three columns
df['Formal'] = df['Title'] + ' ' + df['First'] + ' ' + df['Last']

print(df)

Output:

  First   Last Title   Full_Name          Formal
0  John    Doe   Mr.    John Doe    Mr. John Doe
1  Jane  Smith   Dr.  Jane Smith  Dr. Jane Smith

Handling Missing Values in String Concatenation

When any of the columns contain NaN, the + operator propagates the missing value and produces NaN in the result. You need to handle this explicitly:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'First': ['John', 'Jane'],
    'Middle': [np.nan, 'Marie'],
    'Last': ['Doe', 'Smith']
})

# Without handling NaN: the result for John would be NaN
print("Without fillna:")
print(df['First'] + ' ' + df['Middle'] + ' ' + df['Last'])
print()

# With fillna: replace NaN with empty string before concatenating
df['Full'] = (df['First'] + ' ' +
              df['Middle'].fillna('') + ' ' +
              df['Last'])

print("With fillna:")
print(df['Full'])

Output:

Without fillna:
0                 NaN
1    Jane Marie Smith
dtype: object

With fillna:
0           John  Doe
1    Jane Marie Smith
Name: Full, dtype: object

For a cleaner approach that avoids double spaces when middle names are missing, use .agg() with a join function:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'First': ['John', 'Jane'],
    'Middle': [np.nan, 'Marie'],
    'Last': ['Doe', 'Smith']
})

# Join non-empty values with a space
df['Full'] = df[['First', 'Middle', 'Last']].fillna('').agg(
    lambda row: ' '.join(part for part in row if part), axis=1
)

print(df)

Output:

  First Middle   Last              Full
0  John    NaN    Doe          John Doe
1  Jane  Marie  Smith  Jane Marie Smith

Reshaping Wide to Long Format with `melt()`

Sometimes "combining columns" really means restructuring the data so that column headers become values in a single column. The melt() function converts wide format to long format, which is often required for visualization libraries like Seaborn and Plotly:

import pandas as pd

df = pd.DataFrame({
    'Year': [2023, 2024],
    'Q1': [100, 150],
    'Q2': [120, 180],
    'Q3': [130, 190],
    'Q4': [140, 200]
})

melted = df.melt(
    id_vars=['Year'],
    var_name='Quarter',
    value_name='Sales'
)

print(melted)

Output:

   Year Quarter  Sales
2023      Q1    100
2024      Q1    150
2023      Q2    120
2024      Q2    180
2023      Q3    130
2024      Q3    190
2023      Q4    140
2024      Q4    200

The four quarter columns have been collapsed into two columns: Quarter (holding the former column names) and Sales (holding the values). The Year column is preserved as an identifier.

Aggregating Across Columns with `.agg()`

When you need different aggregation functions beyond a simple sum, select the columns and apply the desired function:

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

cols = ['A', 'B', 'C']

df['Sum'] = df[cols].sum(axis=1)
df['Mean'] = df[cols].mean(axis=1)
df['Range'] = df[cols].max(axis=1) - df[cols].min(axis=1)

print(df)

Output:

   A  B  C  Sum  Mean  Range
1  4  7   12   4.0      6
2  5  8   15   5.0      6
3  6  9   18   6.0      6

Using `.apply()` for Complex Combination Logic

When the combination logic involves formatting or conditional rules that cannot be expressed with simple operators, use .apply() with axis=1:

import pandas as pd

df = pd.DataFrame({
    'City': ['New York', 'Los Angeles'],
    'State': ['NY', 'CA'],
    'Zip': ['10001', '90001']
})

df['Location'] = df.apply(
    lambda row: f"{row['City']}, {row['State']} {row['Zip']}",
    axis=1
)

print(df)

Output:

          City State    Zip               Location
0     New York    NY  10001     New York, NY 10001
1  Los Angeles    CA  90001  Los Angeles, CA 90001

tip

Reserve .apply() for cases where the combination logic is too complex for vectorized operations. For simple string concatenation or arithmetic, the + operator and .sum() are significantly faster on large datasets.

Quick Reference

Goal	Method	Example
Sum columns	`.sum(axis=1)`	`df[cols].sum(axis=1)`
Collapse by category	`groupby(mapping, axis=1)`	Map subcategories to totals
Join strings	`+` operator	`df['A'] + ' ' + df['B']`
Join strings (with NaN)	`.fillna('') + ...`	Handle missing before joining
Reshape wide to long	`.melt()`	Column headers become row values
Custom aggregation	`.mean()`, `.max()`, `.min()`	`df[cols].mean(axis=1)`
Complex combination logic	`.apply(func, axis=1)`	Formatted strings, conditionals

Use df[cols].sum(axis=1) for straightforward numeric totals.
Use string concatenation with + for merging text columns, adding .fillna('') when missing values are possible.
Use melt() to reshape wide data into long format for visualization.

For complex multi-column logic that does not fit neatly into vectorized operations, use .apply() with axis=1.

Summing Numeric Columns​

Collapsing Columns by Category with groupby()​

Concatenating String Columns​

Handling Missing Values in String Concatenation​

Reshaping Wide to Long Format with melt()​

Aggregating Across Columns with .agg()​

Using .apply() for Complex Combination Logic​

Quick Reference​

Table of Contents

Summing Numeric Columns

Collapsing Columns by Category with `groupby()`

Concatenating String Columns

Handling Missing Values in String Concatenation

Reshaping Wide to Long Format with `melt()`

Aggregating Across Columns with `.agg()`

Using `.apply()` for Complex Combination Logic

Quick Reference