Python Pandas: How to Apply a Function to Rows or Columns in Pandas

The .apply() method in Pandas lets you run custom functions across every row or column of a DataFrame. It is one of the most flexible tools available for data transformation, enabling everything from simple calculations to complex multi-column logic. However, understanding when .apply() is the right choice and when vectorized operations would be faster is key to writing efficient Pandas code.

In this guide, you will learn how to use .apply() along both axes, pass additional arguments to your functions, and recognize situations where a vectorized alternative would be significantly more performant.

Applying a Function to Each Column (`axis=0`)

By default, .apply() passes each column as a Series to your function. This is useful for computing summary statistics or transformations that operate on an entire column at once:

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30]
})

# Calculate the range (max - min) of each column
result = df.apply(lambda col: col.max() - col.min())

print(result)

Output:

A     2
B    20
dtype: int64

The function receives column A as a Series, computes 3 - 1 = 2, then receives column B and computes 30 - 10 = 20.
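To make the mechanics concrete, here is a minimal sketch of what the function receives and how its return value shapes the result. The helper name `describe_column` is hypothetical and exists only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30]
})

def describe_column(col):
    # col is a pandas Series; col.name holds the column label
    return f"{col.name}: {type(col).__name__} of length {len(col)}"

# One scalar per column -> the result is a Series indexed by column label
summary = df.apply(describe_column)
print(summary)

# One Series per column (same length as the input) -> the result is a DataFrame
doubled = df.apply(lambda col: col * 2)
print(doubled)
```

In short, a function that reduces each column to a single value produces a Series, while a function that returns a Series of the same length produces a DataFrame.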

More Column-Wise Examples

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Standard deviation of each column
# (np.std defaults to ddof=0, the population standard deviation)
print(df.apply(np.std))

# Count non-zero values per column
print(df.apply(lambda col: (col != 0).sum()))

# Normalize each column to zero mean and unit variance
normalized = df.apply(lambda col: (col - col.mean()) / col.std())
print(normalized)

Output:

A    0.816497
B    0.816497
dtype: float64
A    3
B    3
dtype: int64
     A    B
0 -1.0 -1.0
1  0.0  0.0
2  1.0  1.0
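For comparison, the column-wise calculations shown so far can likely be written with built-in DataFrame methods instead of .apply(). This is a sketch on the same small DataFrame; the variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Range of each column, without apply
col_range = df.max() - df.min()

# Population standard deviation (ddof=0), matching np.std's default
pop_std = df.std(ddof=0)

# Count of non-zero values per column
nonzero = (df != 0).sum()

# Normalization using the sample standard deviation (pandas default, ddof=1)
normalized = (df - df.mean()) / df.std()

print(col_range, pop_std, nonzero, normalized, sep="\n")
```

When a built-in method like these exists, it is usually the shorter and faster option; .apply() remains useful when the per-column logic has no ready-made equivalent.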

Applying a Function to Each Row (`axis=1`)

Setting axis=1 passes each row as a Series to your function. This is the mode you need when your logic depends on values from multiple columns in the same row:

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30]
})

# Sum values across each row
df['Total'] = df.apply(lambda row: row['A'] + row['B'], axis=1)

print(df)

Output:

   A   B  Total
0  1  10     11
1  2  20     22
2  3  30     33
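A row-wise function is not limited to returning one value. As a sketch, returning a Series from the function produces one output column per label; the helper `row_stats` and the column names `Total` and `Ratio` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30]
})

def row_stats(row):
    # Each key of the returned Series becomes a column in the result.
    # Mixing an integer sum with a float ratio upcasts both to float.
    return pd.Series({
        'Total': row['A'] + row['B'],
        'Ratio': row['B'] / row['A'],
    })

stats = df.apply(row_stats, axis=1)
print(stats)
```

Building a Series for every row adds overhead, so this pattern is best kept for small or moderately sized DataFrames.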

Row-Wise Examples with Multiple Columns

import pandas as pd

df = pd.DataFrame({
    'First': ['John', 'Jane'],
    'Last': ['Doe', 'Smith'],
    'Age': [25, 30]
})

# Combine columns into a full name
df['Full_Name'] = df.apply(
    lambda row: f"{row['First']} {row['Last']}", axis=1
)

# Conditional logic based on row values
df['Category'] = df.apply(
    lambda row: 'Senior' if row['Age'] >= 30 else 'Junior', axis=1
)

print(df)

Output:

  First   Last  Age   Full_Name Category
0  John    Doe   25    John Doe   Junior
1  Jane  Smith   30  Jane Smith   Senior
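Both derived columns above are simple enough that they can probably be built without .apply() as well. The following sketch uses string concatenation and np.where() on the same data and is meant as a comparison, not a replacement for the row-wise pattern.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'First': ['John', 'Jane'],
    'Last': ['Doe', 'Smith'],
    'Age': [25, 30]
})

# Element-wise string concatenation across two columns
df['Full_Name'] = df['First'] + ' ' + df['Last']

# Two-way conditional evaluated on the whole column at once
df['Category'] = np.where(df['Age'] >= 30, 'Senior', 'Junior')

print(df)
```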

Passing Extra Arguments to Your Function

When your function needs additional parameters beyond the row or column data, use the args parameter for positional arguments or pass keyword arguments directly:

import pandas as pd

df = pd.DataFrame({'Score': [85, 90, 78]})

# Using positional arguments with args
def add_bonus(x, bonus):
    return x + bonus

df['With_Bonus'] = df['Score'].apply(add_bonus, args=(5,))

# Using keyword arguments
def scale_value(x, factor=1, offset=0):
    return x * factor + offset

df['Scaled'] = df['Score'].apply(scale_value, factor=1.1, offset=2)

print(df)

Output:

   Score  With_Bonus  Scaled
0     85          90    95.5
1     90          95   101.0
2     78          83    87.8
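The same mechanisms work with DataFrame.apply() and axis=1. The sketch below assumes a hypothetical `Weight` column and a hypothetical helper `weighted_score`, and also shows functools.partial as an alternative way to bind the extra arguments.

```python
from functools import partial

import pandas as pd

df = pd.DataFrame({'Score': [85, 90, 78], 'Weight': [1.0, 0.5, 2.0]})

def weighted_score(row, bonus, cap=100):
    # bonus arrives through args, cap through a keyword argument
    return min(row['Score'] * row['Weight'] + bonus, cap)

# Extra positional and keyword arguments are forwarded to the function
df['Weighted'] = df.apply(weighted_score, axis=1, args=(5,), cap=120)

# functools.partial binds the extra arguments before apply is called
bound = partial(weighted_score, bonus=5, cap=120)
df['Weighted_Partial'] = df.apply(bound, axis=1)

print(df)
```

Keyword arguments are forwarded only if their names do not collide with .apply()'s own parameters such as axis or args; partial avoids that concern entirely.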

When to Prefer Vectorized Operations Over `.apply()`

While .apply() is flexible, it calls your Python function once per column, row, or element, which is much slower than vectorized operations that run in optimized, compiled code inside NumPy and Pandas. For simple arithmetic, comparisons, or conditions, vectorized alternatives are almost always faster.

Common Replacements

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': range(10000)})

# Simple arithmetic
# Slow
df['B'] = df['A'].apply(lambda x: x + 10)
# Fast
df['B'] = df['A'] + 10

# Conditional logic
# Slow
df['C'] = df['A'].apply(lambda x: 'High' if x > 5000 else 'Low')
# Fast
df['C'] = np.where(df['A'] > 5000, 'High', 'Low')

print(df)

Output:

         A      B     C
0        0     10   Low
1        1     11   Low
2        2     12   Low
3        3     13   Low
4        4     14   Low
...    ...    ...   ...
9995  9995  10005  High
9996  9996  10006  High
9997  9997  10007  High
9998  9998  10008  High
9999  9999  10009  High

[10000 rows x 3 columns]
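np.where() covers two outcomes. For three or more tiers, np.select() takes a list of conditions and a matching list of choices. A minimal sketch, where the thresholds and the `Tier` column name are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(10000)})

# Conditions are checked in order; the first match wins
conditions = [df['A'] < 2500, df['A'] < 7500]
choices = ['Low', 'Medium']

# Rows matching no condition receive the default
df['Tier'] = np.select(conditions, choices, default='High')

print(df['Tier'].value_counts())
```

Because the first matching condition wins, the order of the list matters: the second condition only applies to rows that failed the first.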

Performance Comparison

The speed difference becomes substantial on larger DataFrames:

import pandas as pd
import time

df = pd.DataFrame({'A': range(100000)})

# Using apply (perf_counter is a monotonic, high-resolution timer)
start = time.perf_counter()
for _ in range(10):
    df['B'] = df['A'].apply(lambda x: x * 2)
apply_time = time.perf_counter() - start

# Using vectorized multiplication
start = time.perf_counter()
for _ in range(10):
    df['B'] = df['A'] * 2
vector_time = time.perf_counter() - start

print(f"Apply: {apply_time:.4f}s")
print(f"Vectorized: {vector_time:.4f}s")
print(f"Vectorized is ~{apply_time / vector_time:.0f}x faster")

Output (exact timings vary by machine and library version):

Apply: 0.4077s
Vectorized: 0.0026s
Vectorized is ~155x faster

Tip: Reserve .apply() for complex logic that genuinely cannot be expressed with vectorized operations. For simple math, comparisons, and string operations that have .str accessor equivalents, always use the vectorized approach first.
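As an illustration of the .str accessor mentioned in the tip, the sketch below replaces three would-be .apply() calls with built-in string methods. The sample data and column names are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'Text': ['Hello World', 'Foo Bar 123', 'Test!']})

# Instead of df['Text'].apply(lambda s: s.upper())
df['Upper'] = df['Text'].str.upper()

# Instead of df['Text'].apply(len)
df['Length'] = df['Text'].str.len()

# Instead of df['Text'].apply(lambda s: bool(re.search(r'\d', s)))
df['Has_Digit'] = df['Text'].str.contains(r'\d', regex=True)

print(df)
```

The .str methods also handle missing values gracefully, propagating them rather than raising an error, which a plain lambda would not do without extra checks.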

Good Use Cases for `.apply()`

Despite the performance considerations, .apply() is the right tool when your transformation involves complex logic, external libraries, or operations that do not have a vectorized equivalent:

Complex String Processing

import pandas as pd
import re

df = pd.DataFrame({'Text': ['Hello World', 'Foo Bar 123', 'Test!']})

def extract_alpha_word_count(text):
    """Count only alphabetic words, ignoring numbers and punctuation."""
    return len(re.findall(r'\b[A-Za-z]+\b', text))

df['Word_Count'] = df['Text'].apply(extract_alpha_word_count)

print(df)

Output:

          Text  Word_Count
0  Hello World           2
1  Foo Bar 123           2
2        Test!           1

Multi-Column Business Logic

import pandas as pd

df = pd.DataFrame({
    'Base_Price': [100, 250, 50],
    'Quantity': [2, 1, 5],
    'Member': [True, False, True]
})

def calculate_total(row):
    """Apply tiered discounts based on quantity and membership."""
    subtotal = row['Base_Price'] * row['Quantity']
    if row['Member']:
        subtotal *= 0.9  # 10% member discount
    if row['Quantity'] >= 5:
        subtotal *= 0.95  # Additional 5% bulk discount
    return round(subtotal, 2)

df['Total'] = df.apply(calculate_total, axis=1)

print(df)

Output:

   Base_Price  Quantity  Member   Total
0         100         2    True  180.00
1         250         1   False  250.00
2          50         5    True  213.75

This kind of nested conditional logic with multiple interacting conditions is difficult to express cleanly with np.where() or np.select(), making .apply() the practical choice.
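If such a rule set later becomes a bottleneck on a large DataFrame, it can sometimes still be vectorized at some cost to readability. The sketch below assumes the two discounts are independent multipliers, which holds for this example but may not for more intertwined rules; the column name `Total_Vectorized` is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Base_Price': [100, 250, 50],
    'Quantity': [2, 1, 5],
    'Member': [True, False, True]
})

# Each discount becomes a per-row multiplier (1.0 means no discount)
member_factor = np.where(df['Member'], 0.9, 1.0)
bulk_factor = np.where(df['Quantity'] >= 5, 0.95, 1.0)

df['Total_Vectorized'] = (
    df['Base_Price'] * df['Quantity'] * member_factor * bulk_factor
).round(2)

print(df)
```

Once rules start depending on each other, for example a bulk discount that only applies to non-members, the .apply() version tends to stay easier to read and maintain.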

Quick Reference

Axis	Direction	Each Function Call Receives
`axis=0` (default)	Column-wise	Entire column as a Series
`axis=1`	Row-wise	Entire row as a Series

Operation Type	Recommended Approach
Simple arithmetic	Vectorized: `df['A'] + df['B']`
Simple conditions	`np.where()` or `np.select()`
Complex multi-column logic	`.apply(func, axis=1)`
Built-in string methods	`.str` accessor (e.g., `df['col'].str.upper()`)
Custom string processing	`.apply(func)`

Use .apply(func, axis=0) for column-wise operations and axis=1 for row-wise operations.
Pass extra arguments via args=(val,) or as keyword arguments directly.
Always check whether a vectorized alternative exists before reaching for .apply(), as vectorized operations are often 10 to 100 times faster for simple transformations, and sometimes more, as the benchmark above shows.
