Python Pandas: How to Get All Combinations of Two Columns in a Pandas DataFrame

In data analysis, you sometimes need to generate all possible combinations (the Cartesian product) of the values in two columns. This is useful for scenarios like pairing participants in experiments, creating feature combinations for machine learning, generating test cases, or building comparison matrices.

In this guide, you will learn how to compute all combinations of two DataFrame columns using Python's itertools.product(), Pandas' built-in merge() with a cross join, and pd.MultiIndex.from_product().

Setting Up the Example

import pandas as pd

df = pd.DataFrame({
    'gents': ['Michael', 'Daniel'],
    'ladies': ['Emily', 'Olivia']
})

print(df)

Output:

     gents  ladies
0  Michael   Emily
1   Daniel  Olivia

Our goal is to generate every possible pairing between the gents column and the ladies column - that is, all 2 × 2 = 4 combinations.

Method 1: Using `itertools.product()`

The itertools.product() function computes the Cartesian product of input iterables - every element from the first iterable is paired with every element from the second:

import pandas as pd
from itertools import product

df = pd.DataFrame({
    'gents': ['Michael', 'Daniel'],
    'ladies': ['Emily', 'Olivia']
})

# Generate all combinations
combinations = list(product(df['gents'], df['ladies']))

print("All combinations:")
for combo in combinations:
    print(combo)

Output:

All combinations:
('Michael', 'Emily')
('Michael', 'Olivia')
('Daniel', 'Emily')
('Daniel', 'Olivia')

Each value from gents is paired with every value from ladies.

Converting the Result to a DataFrame

To work with the combinations as tabular data, convert the list of tuples to a DataFrame:

import pandas as pd
from itertools import product

df = pd.DataFrame({
    'gents': ['Michael', 'Daniel'],
    'ladies': ['Emily', 'Olivia']
})

combinations = list(product(df['gents'], df['ladies']))
result = pd.DataFrame(combinations, columns=['gents', 'ladies'])

print(result)

Output:

     gents  ladies
0  Michael   Emily
1  Michael  Olivia
2   Daniel   Emily
3   Daniel  Olivia
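
The example columns contain no repeated values. If a column can contain duplicates (an assumption about your data, not something the example above shows), product() repeats the corresponding pairs. A minimal sketch that deduplicates each column with Series.unique() first, using hypothetical data:

```python
import pandas as pd
from itertools import product

# Hypothetical data: 'Michael' and 'Emily' each appear twice
df = pd.DataFrame({
    'gents': ['Michael', 'Daniel', 'Michael'],
    'ladies': ['Emily', 'Olivia', 'Emily']
})

# unique() keeps the first occurrence of each value, in order of appearance
unique_gents = df['gents'].unique()
unique_ladies = df['ladies'].unique()

result = pd.DataFrame(
    list(product(unique_gents, unique_ladies)),
    columns=['gents', 'ladies']
)

print(len(result))  # 2 unique gents x 2 unique ladies = 4 rows
```

Without the unique() calls, the same input would yield 3 × 3 = 9 rows, several of them repeated.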

Method 2: Using Pandas `merge()` With a Cross Join

Pandas provides a built-in cross join (available since Pandas 1.2.0) that computes the Cartesian product directly:

import pandas as pd

df = pd.DataFrame({
    'gents': ['Michael', 'Daniel'],
    'ladies': ['Emily', 'Olivia']
})

# Create separate DataFrames for each column
gents_df = df[['gents']]
ladies_df = df[['ladies']]

# Cross join produces all combinations
result = gents_df.merge(ladies_df, how='cross')

print(result)

Output:

     gents  ladies
0  Michael   Emily
1  Michael  Olivia
2   Daniel   Emily
3   Daniel  Olivia

Tip

The how='cross' parameter was introduced in Pandas 1.2.0. If you are using an older version, you can simulate it by adding a temporary key column:

# For Pandas < 1.2.0
gents_df = df[['gents']].assign(key=1)
ladies_df = df[['ladies']].assign(key=1)
result = gents_df.merge(ladies_df, on='key').drop('key', axis=1)

Method 3: Using `pd.MultiIndex.from_product()`

To generate the combinations as a MultiIndex (useful for building matrices or pivot-like structures), use pd.MultiIndex.from_product():

import pandas as pd

df = pd.DataFrame({
    'gents': ['Michael', 'Daniel'],
    'ladies': ['Emily', 'Olivia']
})

# Create a MultiIndex from the Cartesian product
index = pd.MultiIndex.from_product(
    [df['gents'], df['ladies']],
    names=['gents', 'ladies']
)

result = pd.DataFrame(index=index).reset_index()
print(result)

Output:

     gents  ladies
0  Michael   Emily
1  Michael  Olivia
2   Daniel   Emily
3   Daniel  Olivia
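
As one illustration of the matrix use case, the sketch below reindexes a Series of known scores against the full product index and then unstacks it into a grid. The scores are hypothetical and exist only for this example.

```python
import pandas as pd

gents = ['Michael', 'Daniel']
ladies = ['Emily', 'Olivia']

# Hypothetical scores, known for only two of the four pairs
known = pd.Series(
    [8, 5],
    index=pd.MultiIndex.from_tuples(
        [('Michael', 'Emily'), ('Daniel', 'Olivia')],
        names=['gents', 'ladies']
    )
)

# Full Cartesian product as the target index
full_index = pd.MultiIndex.from_product(
    [gents, ladies],
    names=['gents', 'ladies']
)

# Pairs without a known score become NaN; unstack moves 'ladies' into columns
matrix = known.reindex(full_index).unstack('ladies')
print(matrix)
```

Reindexing against the product index makes the missing pairs explicit as NaN rather than leaving them absent from the data.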

Practical Example: Student-Course Enrollment

Generate all possible student-course combinations, a useful first step when checking which enrollments are missing:

import pandas as pd
from itertools import product

students = pd.DataFrame({
    'student_id': [101, 102, 103],
    'student_name': ['Alice', 'Bob', 'Charlie']
})

courses = pd.DataFrame({
    'course_id': ['CS101', 'MATH201'],
    'course_name': ['Intro to CS', 'Calculus']
})

# All possible enrollments
all_combos = list(product(students['student_name'], courses['course_name']))
enrollment_matrix = pd.DataFrame(all_combos, columns=['Student', 'Course'])

print(enrollment_matrix)

Output:

   Student       Course
0    Alice  Intro to CS
1    Alice     Calculus
2      Bob  Intro to CS
3      Bob     Calculus
4  Charlie  Intro to CS
5  Charlie     Calculus
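
To actually find the missing enrollments, the full set of pairs can be compared against the enrollments that already exist. The sketch below assumes a hypothetical enrolled table keyed by student_id and course_id, and uses a left join with indicator=True as an anti-join:

```python
import pandas as pd

students = pd.DataFrame({'student_id': [101, 102, 103]})
courses = pd.DataFrame({'course_id': ['CS101', 'MATH201']})

# Hypothetical table of enrollments that already exist
enrolled = pd.DataFrame({
    'student_id': [101, 101, 102],
    'course_id': ['CS101', 'MATH201', 'CS101']
})

# Every possible (student, course) pair (requires Pandas 1.2.0+)
all_pairs = students.merge(courses, how='cross')

# Left join with an indicator column, then keep pairs found only on the left
merged = all_pairs.merge(
    enrolled,
    on=['student_id', 'course_id'],
    how='left',
    indicator=True
)
missing = merged.loc[
    merged['_merge'] == 'left_only',
    ['student_id', 'course_id']
].reset_index(drop=True)

print(missing)
```

Joining on the ID columns rather than on names avoids ambiguity if two students or courses share a name.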

Combinations of a Column With Itself

To generate all pairs from a single column (e.g., for round-robin matchups), use itertools.combinations(), which never pairs an element with itself and yields each pair only once:

from itertools import combinations
import pandas as pd

df = pd.DataFrame({'players': ['Alice', 'Bob', 'Charlie', 'Diana']})

# All unique pairs (no self-pairing, no duplicates)
matchups = list(combinations(df['players'], 2))
result = pd.DataFrame(matchups, columns=['Player 1', 'Player 2'])

print(result)

Output:

  Player 1 Player 2
0    Alice      Bob
1    Alice  Charlie
2    Alice    Diana
3      Bob  Charlie
4      Bob    Diana
5  Charlie    Diana

product() vs. combinations() vs. permutations()

Function	Self-pairing	Order matters	Example for `[A, B]`
`product(col, col)`	✅ Yes	✅ Yes	`(A,A), (A,B), (B,A), (B,B)`
`combinations(col, 2)`	❌ No	❌ No	`(A,B)`
`permutations(col, 2)`	❌ No	✅ Yes	`(A,B), (B,A)`

Choose the function that matches your specific pairing requirements.
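
The rows of the table can be reproduced with a short script. This sketch uses a plain list in place of a DataFrame column; the variable names are illustrative only.

```python
from itertools import product, combinations, permutations

col = ['A', 'B']

# Self-pairing allowed, order matters
pairs_product = list(product(col, col))
# No self-pairing, order ignored
pairs_combinations = list(combinations(col, 2))
# No self-pairing, order matters
pairs_permutations = list(permutations(col, 2))

print(pairs_product)       # [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
print(pairs_combinations)  # [('A', 'B')]
print(pairs_permutations)  # [('A', 'B'), ('B', 'A')]
```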

Performance Considerations

The Cartesian product grows multiplicatively. If column A has m values and column B has n values, the result has m × n rows:

Column A Size	Column B Size	Result Rows
10	10	100
100	100	10,000
1,000	1,000	1,000,000
10,000	10,000	100,000,000
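
Because the row count is simply m × n, it can be checked before any join runs. The helper below is a sketch: the function name cross_with_limit and the default threshold of 1,000,000 rows are illustrative assumptions, not part of Pandas.

```python
import pandas as pd

def cross_with_limit(left, right, max_rows=1_000_000):
    """Cross join two DataFrames, refusing if the result would exceed max_rows.

    The function name and default threshold are illustrative assumptions.
    """
    expected = len(left) * len(right)
    if expected > max_rows:
        raise ValueError(
            f"cross join would produce {expected:,} rows (limit {max_rows:,})"
        )
    return left.merge(right, how='cross')

a = pd.DataFrame({'a': range(10)})
b = pd.DataFrame({'b': range(10)})

print(len(cross_with_limit(a, b)))  # 100
```

Raising an error up front is cheaper than discovering the problem after the process has run out of memory.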

Watch out for memory with large columns

For large columns, the cross product can consume significant memory. Consider:

Filtering before generating combinations (reduce input size).
Processing in chunks using itertools.product() as a lazy iterator instead of converting to a list.
Using database joins if the data is in a database.

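import pandas as pd
from itertools import product

df = pd.DataFrame({'gents': ['Michael', 'Daniel'], 'ladies': ['Emily', 'Olivia']})

def process(gent, lady):
    print(gent, lady)  # placeholder for your own per-pair logic

# ✅ Lazy: yields one combo at a time; only the input columns are held in memory
for gent, lady in product(df['gents'], df['ladies']):
    process(gent, lady)

# ❌ Eager: loads ALL combos into memory at once
all_combos = list(product(df['gents'], df['ladies']))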
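
For the chunked processing mentioned in the list above, one possible approach is to slice the lazy iterator with itertools.islice(). The helper name product_in_chunks and the chunk size are hypothetical choices for this sketch:

```python
from itertools import islice, product

def product_in_chunks(a, b, chunk_size):
    """Yield lists of at most chunk_size pairs from the Cartesian product."""
    pairs = product(a, b)  # lazy iterator over all pairs
    while True:
        # Pull the next chunk_size pairs without materializing the rest
        chunk = list(islice(pairs, chunk_size))
        if not chunk:
            break
        yield chunk

gents = ['Michael', 'Daniel']
ladies = ['Emily', 'Olivia']

for chunk in product_in_chunks(gents, ladies, chunk_size=3):
    print(len(chunk), chunk)
```

Each chunk could be converted to a DataFrame and written out or aggregated before the next one is requested, which keeps peak memory proportional to the chunk size rather than to m × n.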

Comparison of Methods

Method	Returns	Best For
`itertools.product()`	Iterator of tuples	General-purpose, works with any iterables
`merge(how='cross')`	DataFrame	Staying within Pandas, clean DataFrame output
`MultiIndex.from_product()`	MultiIndex / DataFrame	Creating structured indices or pivot tables

Conclusion

Generating all combinations of two DataFrame columns is straightforward in Python.

itertools.product() is the most versatile approach and works with any iterable, while pd.merge(how='cross') keeps everything within the Pandas ecosystem and produces a DataFrame directly.
For structured indexing scenarios, pd.MultiIndex.from_product() provides an elegant solution.

Whichever method you choose, be mindful of the output size: the Cartesian product grows multiplicatively, so always consider filtering your data before generating combinations when working with large datasets.
