Python Pandas: How to Apply Functions to Specific Columns or Rows
When transforming data in Pandas, you rarely need to apply a function to an entire DataFrame at once. Most real-world datasets contain a mix of numeric, string, and datetime columns, and applying a numeric function indiscriminately will raise errors on incompatible types. Targeting specific columns, rows, or even individual cells for transformation is essential for writing correct, efficient, and maintainable data processing code.
In this guide, you will learn how to apply functions to single columns, multiple columns, specific rows, and combinations of both. You will also see why vectorized alternatives should be your first choice and when .apply() is genuinely necessary.
Applying a Function to a Single Column
Select the column as a Series and call .apply() with your function:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Score': [100, 200, 300]
})
# Apply sqrt only to the Score column
df['Score'] = df['Score'].apply(np.sqrt)
print(df)
Output:
Name Score
0 Alice 10.000000
1 Bob 14.142136
2 Charlie 17.320508
The Name column is left completely untouched because we only selected Score for the transformation.
Vectorized Alternative
For functions that have NumPy equivalents, the vectorized version is significantly faster and more readable:
# Faster than .apply() for simple operations
df['Score'] = np.sqrt(df['Score'])
Both approaches produce identical results when run on the original Score column, but the vectorized call operates on the whole column at once, whereas .apply() with an ordinary Python function calls that function once per element.
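To check the difference on your own machine, a rough timing sketch like the one below can help. It uses the standard library's math.sqrt as a stand-in for an arbitrary per-element Python function. The data size and repeat count are arbitrary assumptions, and the absolute timings will vary with hardware and library versions.

```python
import math
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 200,000 non-negative floats
rng = np.random.default_rng(0)
scores = pd.Series(rng.uniform(0, 1000, size=200_000))

# Both approaches compute the same values
via_apply = scores.apply(math.sqrt)   # Python-level call per element
via_numpy = np.sqrt(scores)           # one vectorized call on the whole Series
same_values = bool(np.allclose(via_apply, via_numpy))

# Time each approach a few times; numbers depend on your machine
t_apply = timeit.timeit(lambda: scores.apply(math.sqrt), number=3)
t_numpy = timeit.timeit(lambda: np.sqrt(scores), number=3)
print(f"same values: {same_values}")
print(f".apply(math.sqrt): {t_apply:.3f}s   np.sqrt: {t_numpy:.3f}s")
```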
Applying a Function to Multiple Columns
Select several columns as a sub-DataFrame by passing a list of column names:
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [10, 20, 30],
'C': ['x', 'y', 'z'] # Non-numeric column
})
# Apply only to numeric columns A and B
cols_to_transform = ['A', 'B']
df[cols_to_transform] = df[cols_to_transform].apply(lambda x: x + 100)
print(df)
Output:
A B C
0 101 110 x
1 102 120 y
2 103 130 z
Column C is excluded from the operation, preventing the type error that would occur if you tried adding 100 to string values.
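Because DataFrame.apply() passes each selected column to the function as a whole Series, the function can also use column-level statistics. The following is a small sketch of that idea; scale_to_max is a hypothetical helper written for this example, not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 4],
    'B': [10, 20, 40],
    'C': ['x', 'y', 'z'],  # Non-numeric column, left out of the selection
})

def scale_to_max(col):
    """Hypothetical helper: divide a column (a Series) by its own maximum."""
    return col / col.max()

cols_to_transform = ['A', 'B']

# Each of A and B is passed to scale_to_max as one Series
scaled = df[cols_to_transform].apply(scale_to_max)
print(scaled)
```

The result is stored in a separate variable here so the original integer columns stay available for comparison.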
Element-Wise Operations with .map()
When you need to apply a function to each individual cell rather than each column as a whole, use .map() on the selected columns:
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Square each element in selected columns
df[['A', 'B']] = df[['A', 'B']].map(lambda x: x ** 2)
print(df)
Output:
A B
0 1 9
1 4 16
In Pandas versions before 2.1, this DataFrame method was called .applymap(). Pandas 2.1 introduced DataFrame.map() and deprecated .applymap(). If you are using an older version, replace .map() with .applymap() in the example above.
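If the same code has to run on both older and newer Pandas versions, one possible approach is to check which method the DataFrame class provides. This is only a sketch; elementwise is a hypothetical helper name, not a Pandas function.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

def elementwise(frame, func):
    """Hypothetical helper: call DataFrame.map if it exists, else applymap."""
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)       # Pandas 2.1 and later
    return frame.applymap(func)      # Older Pandas versions

# Square each element in the selected columns
squared = elementwise(df[['A', 'B']], lambda x: x ** 2)
print(squared)
```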
Applying a Function to Specific Rows
By Label with .loc[]
Use .loc[] to target rows by their index labels:
import pandas as pd
df = pd.DataFrame({
'Value': [10, 20, 30, 40]
}, index=['a', 'b', 'c', 'd'])
# Double only rows 'b' and 'd'
rows_to_change = ['b', 'd']
df.loc[rows_to_change] = df.loc[rows_to_change].apply(lambda x: x * 2)
print(df)
Output:
Value
a 10
b 40
c 30
d 80
By Position with .iloc[]
Use .iloc[] when you want to target rows by their integer position:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})
# Apply to the first 3 rows only
df.iloc[:3] = df.iloc[:3].apply(lambda x: x * 10)
print(df)
Output:
A
0 10
1 20
2 30
3 4
4 5
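For simple arithmetic like the doubling and multiplying shown in this section, the same row selections also work with plain operators and no .apply() call. A minimal sketch, using made-up values:

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

# By label: double rows 'b' and 'd'
df.loc[['b', 'd'], 'Value'] *= 2

# By position: add 1 to the first two rows of the same column
value_pos = df.columns.get_loc('Value')
df.iloc[:2, value_pos] += 1

print(df)
```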
Targeting Specific Rows and Columns Together
Combine row and column selection to apply a function to a precise rectangular region of the DataFrame:
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [10, 20, 30],
'C': [100, 200, 300]
}, index=['x', 'y', 'z'])
# Double only columns A and B for rows x and y
df.loc[['x', 'y'], ['A', 'B']] = df.loc[['x', 'y'], ['A', 'B']] * 2
print(df)
Output:
A B C
x 2 20 100
y 4 40 200
z 3 30 300
Row z and column C remain unchanged because they were not included in the selection.
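The same selection works with .apply() when the transformation is more than a single operator. In the sketch below, subtract_min is a hypothetical helper; note that it only sees the selected rows of each selected column, so the minimum is computed over rows x and y only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [10, 20, 30],
    'C': [100, 200, 300],
}, index=['x', 'y', 'z'])

rows, cols = ['x', 'y'], ['A', 'B']

def subtract_min(col):
    """Hypothetical helper: shift a column so its smallest value becomes 0."""
    return col - col.min()

# Only the 2x2 region (rows x, y and columns A, B) is transformed
df.loc[rows, cols] = df.loc[rows, cols].apply(subtract_min)
print(df)
```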
Why Targeting Specific Columns Matters
Applying numeric functions to a DataFrame that contains non-numeric columns will raise errors. This is one of the most common reasons to target specific columns rather than operating on the entire DataFrame.
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob'],
'Score': [85, 92]
})
# This will fail because sqrt cannot be applied to strings
try:
    result = df.apply(np.sqrt)
except TypeError as e:
    print(f"Error: {e}")
Output:
Error: loop of ufunc does not support argument 0 of type str which has no callable sqrt method
The fix is to target only the column that should receive the transformation:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Name': ['Alice', 'Bob'],
'Score': [85, 92]
})
df['Score'] = np.sqrt(df['Score'])
print(df)
Output:
Name Score
0 Alice 9.219544
1 Bob 9.591663
Conditional Application
Sometimes you need to apply a function only to rows that meet a specific condition. Use a boolean mask with .loc[] to select those rows:
import pandas as pd
df = pd.DataFrame({
'Value': [10, 25, 50, 75, 100]
})
# Double only values greater than 30
mask = df['Value'] > 30
df.loc[mask, 'Value'] = df.loc[mask, 'Value'].apply(lambda x: x * 2)
print(df)
Output:
Value
0 10
1 25
2 100
3 150
4 200
For simple conditional transformations like this, np.where() provides a cleaner and faster alternative:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Value': [10, 25, 50, 75, 100]
})
# Same result, more concise
df['Value'] = np.where(df['Value'] > 30, df['Value'] * 2, df['Value'])
print(df)
Output:
Value
0 10
1 25
2 100
3 150
4 200
Use np.where() for simple "if-else" conditions and np.select() for multiple conditions. Reserve the .loc[] plus .apply() pattern for cases where the transformation logic is too complex for a single expression.
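As an illustration of the multi-condition case, here is a minimal np.select() sketch. The thresholds and multipliers are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Value': [10, 25, 50, 75, 100]})

# Conditions are checked in order; the first one that matches wins
conditions = [
    df['Value'] > 70,   # large values
    df['Value'] > 30,   # medium values
]
choices = [
    df['Value'] * 3,    # triple large values
    df['Value'] * 2,    # double medium values
]

# Rows that match no condition keep their original value
df['Value'] = np.select(conditions, choices, default=df['Value'])
print(df)
```

Because the first matching condition wins, the stricter test (greater than 70) is listed before the looser one (greater than 30).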
Automatically Selecting Columns by Data Type
When you want to apply a function to all numeric columns without listing them manually, use select_dtypes():
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob'],
'Score': [85, 92],
'Grade': [3.8, 3.9],
'City': ['NYC', 'LA']
})
# Select only numeric columns
numeric_cols = df.select_dtypes(include='number').columns
df[numeric_cols] = df[numeric_cols] * 2
print(df)
Output:
Name Score Grade City
0 Alice 170 7.6 NYC
1 Bob 184 7.8 LA
This approach is especially useful when working with DataFrames that have many columns and you do not want to maintain a hardcoded list of column names.
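The same selection can feed .apply() or a NumPy function, which avoids the type error on string columns shown earlier. A short sketch with made-up values follows; the exclude parameter selects the complementary set of columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Score': [81, 100],
    'Grade': [4.0, 9.0],
    'City': ['NYC', 'LA'],
})

numeric_cols = df.select_dtypes(include='number').columns
text_cols = df.select_dtypes(exclude='number').columns

# np.sqrt never sees the string columns, so no TypeError is raised
df[numeric_cols] = df[numeric_cols].apply(np.sqrt)

print(list(text_cols))
print(df)
```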
Quick Reference
| Target | Method |
|---|---|
| Single column | df['col'].apply(func) |
| Multiple columns | df[['c1', 'c2']].apply(func) |
| Specific rows by label | df.loc[['r1', 'r2']].apply(func) |
| Specific rows by position | df.iloc[0:3].apply(func) |
| Rows and columns together | df.loc[rows, cols].apply(func) |
| Conditional rows | df.loc[mask, 'col'].apply(func) |
| All numeric columns | df.select_dtypes(include='number') |
- Select specific columns with df[['col1', 'col2']] and specific rows with df.loc[rows] or df.iloc[positions] before applying functions.
- This prevents type errors on mixed DataFrames and focuses the computation only where it is needed.
- Always consider vectorized alternatives first, as they are typically 10 to 100 times faster than .apply() for straightforward transformations.