Python Pandas: How to Find Unique Values From Multiple Columns in Pandas

When analyzing data in Pandas, you often need to extract unique values - not just from a single column, but across multiple columns simultaneously. This is useful for tasks like identifying all distinct categories, finding all unique names in a dataset, or building dropdown options from several fields.

In this guide, you will learn multiple methods to extract unique values from multiple DataFrame columns, understand the trade-offs of each approach, and see when to use which method.

Setting Up the Example

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

print(df)

Output:

  FirstName  LastName  Age
 Michael     Smith   26
   Emily   Johnson   25
 Jessica  Williams   25
  Daniel     Brown   27
   James     Davis   28
  Daniel    Miller   30

Notice that 'Prateek' appears twice in FirstName and 'Yadav' appears twice in LastName. We want to find unique values across these columns.

Unique Values From a Single Column

Before working with multiple columns, here is the basic approach for a single column using the unique() method:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

unique_first_names = df['FirstName'].unique()
print(unique_first_names)

Output:

['Michael' 'Emily' 'Jessica' 'Daniel' 'James']

'Daniel' appears only once in the result despite being in the column twice.

Method 1: Using `pd.concat()` and `unique()`

Concatenate the desired columns into a single Series, then call unique() on the combined result. This works with any combination of data types:

import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})


unique_values = pd.concat([df['FirstName'], df['LastName']]).unique()
print(unique_values)

Output:

['Michael' 'Emily' 'Jessica' 'Daniel' 'James' 'Smith' 'Johnson' 'Williams'
 'Brown' 'Davis' 'Miller']

Including Numeric Columns

You can include columns of any type - strings, integers, floats - in the concatenation:

import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

unique_all = pd.concat([df['FirstName'], df['LastName'], df['Age']]).unique()
print(unique_all)

Output:

['Michael' 'Emily' 'Jessica' 'Daniel' 'James' 'Smith' 'Johnson' 'Williams'
 'Brown' 'Davis' 'Miller' 26 25 27 28 30]

tip

This is the most versatile method - it handles mixed data types (strings and numbers together) without issues.

Method 2: Using `numpy.unique()`

NumPy's unique() function works on the underlying array values. Select the desired columns and call np.unique() on the .values array:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})


unique_names = np.unique(df[['FirstName', 'LastName']].values)
print(unique_names)

Output:

['Brown' 'Daniel' 'Davis' 'Emily' 'James' 'Jessica' 'Johnson' 'Michael'
 'Miller' 'Smith' 'Williams']

Notice that the results are sorted alphabetically: np.unique() always returns sorted values, unlike Pandas' unique() which preserves the order of first appearance.

Limitation: Cannot mix string and numeric columns

np.unique() cannot handle columns with different data types in the same operation:

# ❌ This will raise an error or produce unexpected results
np.unique(df[['LastName', 'Age']].values)
# TypeError: '<' not supported between instances of 'int' and 'str'

Use Method 1 (pd.concat().unique()) when you need to combine columns of different types.

Method 3: Using Python Sets

Convert each column to a set and use the union operator (|) to combine them. Sets inherently contain only unique values:

import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

unique_values = set(df['FirstName']) | set(df['LastName']) | set(df['Age'])
print(unique_values)

Output:

{'Johnson', 'Miller', 'James', 'Smith', 'Emily', 'Williams', 'Davis', 'Michael', 30, 'Brown', 'Daniel', 'Jessica', 26, 27, 28, 25}

info

Sets are unordered - the output order is not predictable. If order matters, use Method 1 instead.

This method works with any combination of data types and is particularly concise for combining many columns:

# Union of all columns at once
unique_values = set().union(*[set(df[col]) for col in df.columns])
print(sorted(unique_values, key=str))

Method 4: Using `DataFrame.melt()` and `unique()`

The melt() function unpivots the DataFrame, stacking selected columns into a single column. This is clean and expressive:

import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

unique_values = df[['FirstName', 'LastName']].melt()['value'].unique()
print(unique_values)

Output:

['Michael' 'Emily' 'Jessica' 'Daniel' 'James' 'Smith' 'Johnson' 'Williams'
 'Brown' 'Davis' 'Miller']

melt() transforms the two columns into a single value column, and unique() extracts the distinct values.

Method 5: Using `DataFrame.stack()` and `unique()`

Similar to melt(), stack() pivots columns into rows:

import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

unique_values = df[['FirstName', 'LastName']].stack().unique()
print(unique_values)

Output:

['Michael' 'Smith' 'Emily' 'Johnson' 'Jessica' 'Williams' 'Daniel' 'Brown'
 'James' 'Davis' 'Miller']

Finding Unique Rows (Combinations)

If you need unique combinations across columns (unique rows) rather than unique individual values, use drop_duplicates():

import pandas as pd

df_names = pd.DataFrame({
    'First': ['Alice', 'Bob', 'Alice', 'Bob'],
    'Last': ['Smith', 'Jones', 'Smith', 'Brown']
})

unique_combos = df_names.drop_duplicates()
print(unique_combos)

Output:

   First   Last
Alice  Smith
  Bob  Jones
  Bob  Brown

Row index 2 (Alice Smith) is removed because it is a duplicate of row 0.

Counting Unique Values Per Column

To see how many unique values each column has, use nunique():

import pandas as pd

df = pd.DataFrame({
    'FirstName': ['Michael', 'Emily', 'Jessica', 'Daniel', 'James', 'Daniel'],
    'LastName': ['Smith', 'Johnson', 'Williams', 'Brown', 'Davis', 'Miller'],
    'Age': [26, 25, 25, 27, 28, 30]
})

print(df.nunique())

Output:

FirstName    5
LastName     6
Age          5
dtype: int64

Comparison of Methods

Method	Mixed Types	Sorted Output	Preserves Order	Returns
`pd.concat().unique()`	✅ Yes	❌ No	✅ First appearance	NumPy array
`np.unique()`	❌ No	✅ Yes	❌ No	NumPy array
Python `set` union	✅ Yes	❌ No	❌ No (unordered)	Python set
`melt().unique()`	✅ Yes	❌ No	✅ First appearance	NumPy array
`stack().unique()`	✅ Yes	❌ No	✅ Interleaved	NumPy array

Complete Example

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'City': ['NYC', 'LA', 'NYC', 'Chicago', 'LA'],
    'State': ['NY', 'CA', 'NY', 'IL', 'CA'],
    'Country': ['USA', 'USA', 'USA', 'USA', 'USA']
})

# All unique location values across City and State
unique_locations = pd.concat([df['City'], df['State']]).unique()
print("Unique locations:", unique_locations)
print(f"Count: {len(unique_locations)}")

Output:

Unique locations: ['NYC' 'LA' 'Chicago' 'NY' 'CA' 'IL']
Count: 6

Conclusion

Extracting unique values from multiple Pandas columns can be accomplished with several approaches.

pd.concat().unique() is the most versatile: it works with any combination of data types and preserves the order of first appearance.
np.unique() is fast and returns sorted results but cannot mix string and numeric columns.
Python sets are concise and handle mixed types but produce unordered output.
melt() and stack() offer clean Pandas-native alternatives.

Choose the method that best fits your data types, ordering requirements, and coding style.

Setting Up the Example​

Unique Values From a Single Column​

Method 1: Using pd.concat() and unique()​

Including Numeric Columns​

Method 2: Using numpy.unique()​

Method 3: Using Python Sets​

Method 4: Using DataFrame.melt() and unique()​

Method 5: Using DataFrame.stack() and unique()​

Finding Unique Rows (Combinations)​

Counting Unique Values Per Column​

Comparison of Methods​

Complete Example​

Conclusion​

Table of Contents

Setting Up the Example

Unique Values From a Single Column

Method 1: Using `pd.concat()` and `unique()`

Including Numeric Columns

Method 2: Using `numpy.unique()`

Method 3: Using Python Sets

Method 4: Using `DataFrame.melt()` and `unique()`

Method 5: Using `DataFrame.stack()` and `unique()`

Finding Unique Rows (Combinations)

Counting Unique Values Per Column

Comparison of Methods

Complete Example

Conclusion