Python Pandas: How to Calculate the Number of Months Between Two Dates in Pandas
Calculating the time difference between two dates is a fundamental task in data analysis - whether you are computing customer tenure, contract duration, employee experience, or age. While Pandas makes it easy to find differences in days, converting that difference to months, weeks, or years requires a bit more work.
In this guide, you will learn how to calculate the number of months (and other time units) between two date columns in a Pandas DataFrame using multiple approaches.
Setting Up the Example
import pandas as pd
import numpy as np
import datetime
df = pd.DataFrame({
'start_date': [datetime.datetime(1998, 6, 20), datetime.datetime(2012, 10, 18)],
'end_date': [datetime.datetime(2000, 10, 19), datetime.datetime(2021, 1, 8)]
})
# Ensure columns are datetime type
df['start_date'] = pd.to_datetime(df['start_date'])
df['end_date'] = pd.to_datetime(df['end_date'])
print(df)
Output:
start_date end_date
0 1998-06-20 2000-10-19
1 2012-10-18 2021-01-08
Calculating the Difference in Months
Method 1: Using np.timedelta64 With Day-Based Approximation
Subtract the two date columns to get a Timedelta, then convert to months by dividing by the average number of days per month (~30.44):
import pandas as pd
import numpy as np
import datetime
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
# Calculate difference in days, then convert to months
df['months'] = ((df['end_date'] - df['start_date']) / np.timedelta64(1, 'D') / 30.44).astype(int)
print(df)
Output:
start_date end_date months
0 1998-06-20 2000-10-19 27
1 2012-10-18 2021-01-08 98
Using 30.44 days per month is an approximation (365.25 / 12 = 30.4375). For most analytical purposes this is sufficiently accurate, but for exact calendar-month differences, use Method 2.
Method 2: Using Year and Month Arithmetic (Exact Calendar Months)
For an exact calendar-month calculation, use the year and month components of each date:
import pandas as pd
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
# Exact calendar months
df['months'] = (
(df['end_date'].dt.year - df['start_date'].dt.year) * 12 +
(df['end_date'].dt.month - df['start_date'].dt.month)
)
print(df)
Output:
start_date end_date months
0 1998-06-20 2000-10-19 28
1 2012-10-18 2021-01-08 99
This method counts the number of full calendar-month boundaries crossed, regardless of exact days.
| Method | Accuracy | Best For |
|---|---|---|
| Day-based approximation | ~±1 month | Quick estimates, aggregations |
| Year-month arithmetic | Exact calendar months | Billing cycles, contract terms, reporting |
Method 3: Using pd.DateOffset and relativedelta
For the most precise calculation that accounts for varying month lengths, use dateutil.relativedelta:
import pandas as pd
from dateutil.relativedelta import relativedelta
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
def months_between(start, end):
"""Calculate exact months between two dates."""
delta = relativedelta(end, start)
return delta.years * 12 + delta.months
df['months'] = df.apply(lambda row: months_between(row['start_date'], row['end_date']), axis=1)
print(df)
Output:
start_date end_date months
0 1998-06-20 2000-10-19 27
1 2012-10-18 2021-01-08 98
Using apply() with relativedelta is accurate but slower than vectorized operations. For large DataFrames (millions of rows), prefer Method 1 or Method 2.
Calculating Other Time Units
The same pattern works for days, weeks, and years by changing the divisor or np.timedelta64 unit.
Difference in Days
import pandas as pd
from dateutil.relativedelta import relativedelta
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
df['days'] = (df['end_date'] - df['start_date']).dt.days
print(df[['start_date', 'end_date', 'days']])
Output:
start_date end_date days
0 1998-06-20 2000-10-19 852
1 2012-10-18 2021-01-08 3004
The .dt.days accessor directly returns the number of days as an integer - no conversion needed.
Difference in Weeks
import pandas as pd
import numpy as np
from dateutil.relativedelta import relativedelta
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
df['weeks'] = ((df['end_date'] - df['start_date']) / np.timedelta64(1, 'W')).astype(int)
print(df[['start_date', 'end_date', 'weeks']])
Output:
start_date end_date weeks
0 1998-06-20 2000-10-19 121
1 2012-10-18 2021-01-08 429
Difference in Years
import pandas as pd
from dateutil.relativedelta import relativedelta
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
df['years'] = df.apply(
lambda row: relativedelta(row['end_date'], row['start_date']).years,
axis=1
)
print(df[['start_date', 'end_date', 'years']])
Output:
start_date end_date years
0 1998-06-20 2000-10-19 2
1 2012-10-18 2021-01-08 8
Complete Example: All Time Units Together
import pandas as pd
import numpy as np
df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})
# Calculate time difference
diff = df['end_date'] - df['start_date']
df['days'] = diff.dt.days
df['weeks'] = (diff / np.timedelta64(1, 'W')).astype(int)
# Total calendar months difference
df['months'] = (
(df['end_date'].dt.year - df['start_date'].dt.year) * 12 +
(df['end_date'].dt.month - df['start_date'].dt.month)
)
# Full calendar years difference (correct way)
df['years'] = (
df['end_date'].dt.year - df['start_date'].dt.year
- (
(df['end_date'].dt.month < df['start_date'].dt.month) |
(
(df['end_date'].dt.month == df['start_date'].dt.month) &
(df['end_date'].dt.day < df['start_date'].dt.day)
)
)
)
print(df)
Output:
start_date end_date days weeks months years
0 1998-06-20 2000-10-19 852 121 28 2
1 2012-10-18 2021-01-08 3004 429 99 8
Common Mistake: Not Converting to Datetime First
If your date columns are strings, arithmetic will fail:
# ❌ Strings can't be subtracted
df = pd.DataFrame({
'start': ['2020-01-01', '2021-06-15'],
'end': ['2023-01-01', '2024-06-15']
})
df['diff'] = df['end'] - df['start']
# TypeError: unsupported operand type(s) for -: 'str' and 'str'
Fix: Always convert to datetime first:
# ✅ Convert to datetime before calculating
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])
df['diff_months'] = (
(df['end'].dt.year - df['start'].dt.year) * 12 +
(df['end'].dt.month - df['start'].dt.month)
)
Quick Reference
| Time Unit | Code |
|---|---|
| Days | (df['end'] - df['start']).dt.days |
| Weeks | ((df['end'] - df['start']) / np.timedelta64(1, 'W')).astype(int) |
| Months (approx.) | ((df['end'] - df['start']).dt.days / 30.44).astype(int) |
| Months (exact) | (end.dt.year - start.dt.year) * 12 + (end.dt.month - start.dt.month) |
| Years | ((df['end'] - df['start']) / np.timedelta64(1, 'Y')).astype(int) |
Conclusion
Calculating the number of months between two dates in Pandas can be done with day-based approximation (dividing by 30.44) for quick estimates, year-month arithmetic for exact calendar-month counts, or relativedelta for the most precise results.
For days and weeks, Pandas' built-in Timedelta operations and np.timedelta64 provide fast, vectorized calculations.
Always ensure your date columns are converted to datetime type before performing arithmetic, and choose the method that best matches your precision requirements and dataset size.