Python Pandas: How to Perform Arithmetic Operations on Pandas Series
Arithmetic operations are among the most fundamental tasks when working with data in Python. Pandas Series provide a powerful and intuitive way to perform element-wise calculations using standard operators or their method equivalents. The key distinction from plain NumPy arrays is that Pandas aligns operations based on index labels, not position, which helps prevent subtle bugs but also introduces behavior you need to understand to avoid unexpected results.
This guide covers arithmetic on Pandas Series step by step, from basic operators and scalar broadcasting to handling mismatched indices with fill_value.
Basic Arithmetic Operators
Standard Python arithmetic operators work element-wise on Pandas Series. When two Series share the same index, each element is paired with the element at the matching label:
import pandas as pd
s1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s2 = pd.Series([2, 5, 10], index=['a', 'b', 'c'])
print(f"Addition:\n{s1 + s2}\n")
print(f"Subtraction:\n{s1 - s2}\n")
print(f"Multiplication:\n{s1 * s2}\n")
print(f"Division:\n{s1 / s2}\n")
print(f"Power (s1 squared):\n{s1 ** 2}\n")
print(f"Floor Division:\n{s1 // s2}\n")
print(f"Modulo:\n{s1 % s2}\n")
Output:
Addition:
a 12
b 25
c 40
dtype: int64
Subtraction:
a 8
b 15
c 20
dtype: int64
Multiplication:
a 20
b 100
c 300
dtype: int64
Division:
a 5.0
b 4.0
c 3.0
dtype: float64
Power (s1 squared):
a 100
b 400
c 900
dtype: int64
Floor Division:
a 5
b 4
c 3
dtype: int64
Modulo:
a 0
b 0
c 0
dtype: int64
Every operator produces a new Series with the same index, leaving the original Series unchanged.
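As a quick check of that claim, the sketch below reuses s1 and s2 from the example and confirms that the operands keep their values. The name total is introduced here purely for illustration.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s2 = pd.Series([2, 5, 10], index=['a', 'b', 'c'])

# The operator returns a new Series object
total = s1 + s2

print(total is s1)                    # False: a distinct object
print(s1.tolist())                    # [10, 20, 30] (unchanged)
print(s2.tolist())                    # [2, 5, 10] (unchanged)
print(total.index.equals(s1.index))   # True: same labels, same order
```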
How Index Alignment Works
One of the most important behaviors to understand is that Pandas matches elements by their index label, not by their position in the Series. This means the physical order of elements does not matter:
import pandas as pd
s1 = pd.Series([10, 20], index=['a', 'b'])
s2 = pd.Series([100, 200], index=['b', 'a']) # Different order
result = s1 + s2
print(result)
Output:
a 210
b 120
dtype: int64
Here, a in s1 (value 10) is added to a in s2 (value 200), yielding 210. Likewise, the b values 20 and 100 are paired together. Pairing by position would give 110 and 220 instead, adding values that belong to different labels.
Index alignment is especially useful when working with real-world datasets where rows might be sorted differently across multiple data sources. As long as the labels themselves are consistent, Pandas pairs the right values automatically.
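To see the difference between the two pairing strategies side by side, one option is to strip the labels from one operand with to_numpy(), which forces positional pairing. This is only a sketch for comparison; the names by_label and by_position are illustrative.

```python
import pandas as pd

s1 = pd.Series([10, 20], index=['a', 'b'])
s2 = pd.Series([100, 200], index=['b', 'a'])

# Series + Series: values are paired by label
by_label = s1 + s2

# Series + NumPy array: the array has no labels, so values are paired by position
by_position = s1 + s2.to_numpy()

print(by_label['a'], by_label['b'])         # 210 120
print(by_position['a'], by_position['b'])   # 110 220
```

Reaching for to_numpy() is reasonable when you know both Series are in the same order and want to ignore labels on purpose, but it gives up the safety that alignment provides.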
Handling Mismatched Indices
When two Series have indices that do not fully overlap, the standard operators produce NaN for any label that exists in only one of the two Series:
import pandas as pd
s1 = pd.Series([10, 20], index=['a', 'b'])
s2 = pd.Series([5, 5], index=['b', 'c'])
result = s1 + s2
print(result)
Output:
a NaN
b 25.0
c NaN
dtype: float64
Label a exists only in s1, and label c exists only in s2. Since there is no matching partner for either, Pandas fills those positions with NaN. Notice that the resulting dtype is float64 because NaN is a floating-point value.
Unintentional NaN values caused by mismatched indices are one of the most common sources of bugs in Pandas code. Always inspect your result after performing arithmetic on Series that may have different indices.
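One possible way to perform that inspection is sketched below: compare the indices before the operation and count missing values afterwards. The variable names same_labels, unmatched, and n_missing are illustrative only.

```python
import pandas as pd

s1 = pd.Series([10, 20], index=['a', 'b'])
s2 = pd.Series([5, 5], index=['b', 'c'])

# Before the operation: do both Series carry identical labels in the same order?
same_labels = s1.index.equals(s2.index)

# Which labels appear on only one side?
unmatched = s1.index.symmetric_difference(s2.index)

# After the operation: how many results are missing?
result = s1 + s2
n_missing = int(result.isna().sum())

print(same_labels)       # False
print(list(unmatched))   # ['a', 'c']
print(n_missing)         # 2
```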
Using Arithmetic Methods with fill_value
To avoid NaN when indices do not fully overlap, use the method equivalents of each operator. These methods accept a fill_value parameter that substitutes a default value for any missing label before the calculation is performed:
import pandas as pd
s1 = pd.Series([10, 20], index=['a', 'b'])
s2 = pd.Series([5, 5], index=['b', 'c'])
result = s1.add(s2, fill_value=0)
print(result)
Output:
a 10.0
b 25.0
c 5.0
dtype: float64
Now label a is computed as 10 + 0 = 10 and label c as 0 + 5 = 5. The fill_value is applied to whichever Series is missing that particular label.
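The fill_value is not used when both Series hold a valid value for a label. It substitutes only for a value that is missing on one side, whether the label is absent or the stored value is already NaN. If the value is missing on both sides, the result stays NaN.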
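The following sketch illustrates how fill_value treats an absent label, an existing NaN on one side, and a NaN on both sides. The data is made up for the demonstration.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([10, np.nan, np.nan], index=['a', 'b', 'c'])
s2 = pd.Series([1, 2, np.nan, 4], index=['a', 'b', 'c', 'd'])

result = s1.add(s2, fill_value=0)

# a: both values present      -> 10 + 1, fill_value unused
# b: NaN in s1 only           -> 0 + 2
# c: NaN in both Series       -> stays NaN
# d: label absent from s1     -> 0 + 4
print(result.tolist())   # [11.0, 2.0, nan, 4.0]
```

If the both-sides-missing case should also become a number, a follow-up such as result.fillna(0) is one way to handle it.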
Complete Method Reference
The table below maps each arithmetic operator to its equivalent Pandas method:
| Operation | Operator | Method |
|---|---|---|
| Addition | + | .add(other, fill_value=...) |
| Subtraction | - | .sub(other, fill_value=...) |
| Multiplication | * | .mul(other, fill_value=...) |
| Division | / | .div(other, fill_value=...) |
| Floor Division | // | .floordiv(other, fill_value=...) |
| Modulo | % | .mod(other, fill_value=...) |
| Power | ** | .pow(other, fill_value=...) |
Use the operator form for concise, readable code when you are confident the indices match. Use the method form when you need to handle missing indices gracefully with fill_value.
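As a brief illustration of the table, the sketch below applies two of the methods to the mismatched Series from earlier. Choosing 1 as the fill_value for multiplication is an assumption made here because it is the neutral element for that operation; your data may call for something else.

```python
import pandas as pd

s1 = pd.Series([10, 20], index=['a', 'b'])
s2 = pd.Series([5, 5], index=['b', 'c'])

summed = s1.add(s2, fill_value=0)    # a: 10 + 0, b: 20 + 5, c: 0 + 5
product = s1.mul(s2, fill_value=1)   # a: 10 * 1, b: 20 * 5, c: 1 * 5

# Without fill_value, the method gives the same result as the operator
same = s1.add(s2).equals(s1 + s2)

print(summed.tolist())    # [10.0, 25.0, 5.0]
print(product.tolist())   # [10.0, 100.0, 5.0]
print(same)               # True
```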
Scalar Operations (Broadcasting)
When you perform arithmetic between a Series and a single scalar value, the scalar is broadcast to every element:
import pandas as pd
s = pd.Series([10, 20, 30])
print(f"Add 5:\n{s + 5}\n")
print(f"Multiply by 2:\n{s * 2}\n")
print(f"Divide by 10:\n{s / 10}\n")
print(f"100 minus Series:\n{100 - s}\n")
Output:
Add 5:
0 15
1 25
2 35
dtype: int64
Multiply by 2:
0 20
1 40
2 60
dtype: int64
Divide by 10:
0 1.0
1 2.0
2 3.0
dtype: float64
100 minus Series:
0 90
1 80
2 70
dtype: int64
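Broadcasting applies whether the scalar is on the left or the right of the operator. For commutative operations the result is the same either way (s * 2 equals 2 * s), but for subtraction, division, floor division, modulo, and power the operand order changes the result: 100 - s is not the same as s - 100.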
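A short sketch of how operand order affects scalar arithmetic is shown below. It also uses rsub(), the reversed counterpart of sub(), which computes the scalar minus the Series. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# Commutative: order does not matter
print((s * 2).equals(2 * s))    # True

# Non-commutative: order matters
series_minus = s - 100          # each element minus 100
scalar_minus = 100 - s          # 100 minus each element
via_method = s.rsub(100)        # method form of 100 - s

print(series_minus.tolist())    # [-90, -80, -70]
print(scalar_minus.tolist())    # [90, 80, 70]
print(via_method.tolist())      # [90, 80, 70]
```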
Practical Example: Calculating Profit Margins
Below is a realistic scenario that combines several arithmetic operations to compute profit and margin percentages from monthly revenue and cost data:
import pandas as pd
# Monthly sales data
revenue = pd.Series([1000, 1500, 1200], index=['Jan', 'Feb', 'Mar'])
costs = pd.Series([600, 800, 700], index=['Jan', 'Feb', 'Mar'])
# Calculate profit
profit = revenue - costs
print("Profit:")
print(profit)
# Calculate margin percentage
margin = (profit / revenue) * 100
print("\nMargin (%):")
print(margin.round(1))
Output:
Profit:
Jan 400
Feb 700
Mar 500
dtype: int64
Margin (%):
Jan 40.0
Feb 46.7
Mar 41.7
dtype: float64
Because both Series share the same index (Jan, Feb, Mar), every operation aligns correctly without needing fill_value.
Handling Incomplete Data in Practice
Now suppose the cost data has no entry for March but includes April, a month with no revenue entry:
import pandas as pd
revenue = pd.Series([1000, 1500, 1200], index=['Jan', 'Feb', 'Mar'])
costs = pd.Series([600, 800, 500], index=['Jan', 'Feb', 'Apr'])
# Wrong approach: using the operator directly
profit_wrong = revenue - costs
print("Profit (with NaN):")
print(profit_wrong)
Output:
Profit (with NaN):
Apr NaN
Feb 700.0
Jan 400.0
Mar NaN
dtype: float64
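Both Mar and Apr become NaN because each is missing from one of the two Series. The result index is the sorted union of both indices, which is why the months appear in alphabetical rather than calendar order. A better approach uses .sub() with a fill_value: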
# Better approach: treat a missing value on either side (cost or revenue) as 0
profit_safe = revenue.sub(costs, fill_value=0)
print("Profit (with fill_value=0):")
print(profit_safe)
Output:
Profit (with fill_value=0):
Apr -500.0
Feb 700.0
Jan 400.0
Mar 1200.0
dtype: float64
Now every month has a valid result. Mar is treated as having zero cost, and Apr is treated as having zero revenue.
Choosing the right fill_value depends on your domain. Using 0 for missing costs or revenue is sensible in many business contexts, but in other scenarios a different default (or explicitly keeping NaN and handling it later) might be more appropriate.
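If treating a missing month as zero would be misleading, two alternatives are sketched below: keep the NaN and drop incomplete months afterwards, or restrict both Series to their shared labels before subtracting. The names profit_complete, rev_common, cost_common, and profit_common are illustrative.

```python
import pandas as pd

revenue = pd.Series([1000, 1500, 1200], index=['Jan', 'Feb', 'Mar'])
costs = pd.Series([600, 800, 500], index=['Jan', 'Feb', 'Apr'])

# Option 1: compute with NaN, then drop months that lack either value
profit_complete = (revenue - costs).dropna()

# Option 2: keep only the labels present in both Series, then subtract
rev_common, cost_common = revenue.align(costs, join='inner')
profit_common = rev_common - cost_common

print(profit_complete['Jan'], profit_complete['Feb'])   # 400.0 700.0
print(profit_common['Jan'], profit_common['Feb'])       # 400 700
print(sorted(profit_common.index))                      # ['Feb', 'Jan']
```

Option 1 yields floats because the intermediate result contained NaN, while Option 2 stays in integers since no missing values are ever introduced.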
Summary
Pandas Series arithmetic is built on two core principles: element-wise computation and index-based alignment.
- Use standard operators (+, -, *, /, //, %, **) for clean, readable code when indices match.
- Switch to the equivalent methods (.add(), .sub(), .mul(), .div(), .floordiv(), .mod(), .pow()) with the fill_value parameter when you need to handle missing or mismatched indices gracefully.
- Always remember that Pandas aligns by label, which prevents position-based mistakes but can introduce NaN for non-overlapping indices if you are not careful.