Skip to main content

Python Pandas: How to Calculate the Number of Months Between Two Dates in Pandas

Calculating the time difference between two dates is a fundamental task in data analysis - whether you are computing customer tenure, contract duration, employee experience, or age. While Pandas makes it easy to find differences in days, converting that difference to months, weeks, or years requires a bit more work.

In this guide, you will learn how to calculate the number of months (and other time units) between two date columns in a Pandas DataFrame using multiple approaches.

Setting Up the Example

import pandas as pd
import numpy as np
import datetime

df = pd.DataFrame({
'start_date': [datetime.datetime(1998, 6, 20), datetime.datetime(2012, 10, 18)],
'end_date': [datetime.datetime(2000, 10, 19), datetime.datetime(2021, 1, 8)]
})

# Ensure columns are datetime type
df['start_date'] = pd.to_datetime(df['start_date'])
df['end_date'] = pd.to_datetime(df['end_date'])

print(df)

Output:

  start_date   end_date
0 1998-06-20 2000-10-19
1 2012-10-18 2021-01-08

Calculating the Difference in Months

Method 1: Using np.timedelta64 With Day-Based Approximation

Subtract the two date columns to get a Timedelta, then convert to months by dividing by the average number of days per month (~30.44):

import pandas as pd
import numpy as np
import datetime

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})

# Calculate difference in days, then convert to months
df['months'] = ((df['end_date'] - df['start_date']) / np.timedelta64(1, 'D') / 30.44).astype(int)

print(df)

Output:

  start_date   end_date  months
0 1998-06-20 2000-10-19 27
1 2012-10-18 2021-01-08 98
info

Using 30.44 days per month is an approximation (365.25 / 12 = 30.4375). For most analytical purposes this is sufficiently accurate, but for exact calendar-month differences, use Method 2.

Method 2: Using Year and Month Arithmetic (Exact Calendar Months)

For an exact calendar-month calculation, use the year and month components of each date:

import pandas as pd

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})

# Exact calendar months
df['months'] = (
(df['end_date'].dt.year - df['start_date'].dt.year) * 12 +
(df['end_date'].dt.month - df['start_date'].dt.month)
)

print(df)

Output:

  start_date   end_date  months
0 1998-06-20 2000-10-19 28
1 2012-10-18 2021-01-08 99

This method counts the number of full calendar-month boundaries crossed, regardless of exact days.

When to use which method
MethodAccuracyBest For
Day-based approximation~±1 monthQuick estimates, aggregations
Year-month arithmeticExact calendar monthsBilling cycles, contract terms, reporting

Method 3: Using pd.DateOffset and relativedelta

For the most precise calculation that accounts for varying month lengths, use dateutil.relativedelta:

import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})

def months_between(start, end):
"""Calculate exact months between two dates."""
delta = relativedelta(end, start)
return delta.years * 12 + delta.months

df['months'] = df.apply(lambda row: months_between(row['start_date'], row['end_date']), axis=1)

print(df)

Output:

  start_date   end_date  months
0 1998-06-20 2000-10-19 27
1 2012-10-18 2021-01-08 98
Performance note

Using apply() with relativedelta is accurate but slower than vectorized operations. For large DataFrames (millions of rows), prefer Method 1 or Method 2.

Calculating Other Time Units

The same pattern works for days, weeks, and years by changing the divisor or np.timedelta64 unit.

Difference in Days

import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})

df['days'] = (df['end_date'] - df['start_date']).dt.days

print(df[['start_date', 'end_date', 'days']])

Output:

  start_date   end_date  days
0 1998-06-20 2000-10-19 852
1 2012-10-18 2021-01-08 3004

The .dt.days accessor directly returns the number of days as an integer - no conversion needed.

Difference in Weeks

import pandas as pd
import numpy as np
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})


df['weeks'] = ((df['end_date'] - df['start_date']) / np.timedelta64(1, 'W')).astype(int)

print(df[['start_date', 'end_date', 'weeks']])

Output:

  start_date   end_date  weeks
0 1998-06-20 2000-10-19 121
1 2012-10-18 2021-01-08 429

Difference in Years

import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})

df['years'] = df.apply(
lambda row: relativedelta(row['end_date'], row['start_date']).years,
axis=1
)

print(df[['start_date', 'end_date', 'years']])

Output:

  start_date   end_date  years
0 1998-06-20 2000-10-19 2
1 2012-10-18 2021-01-08 8

Complete Example: All Time Units Together

import pandas as pd
import numpy as np

df = pd.DataFrame({
'start_date': pd.to_datetime(['1998-06-20', '2012-10-18']),
'end_date': pd.to_datetime(['2000-10-19', '2021-01-08'])
})

# Calculate time difference
diff = df['end_date'] - df['start_date']

df['days'] = diff.dt.days
df['weeks'] = (diff / np.timedelta64(1, 'W')).astype(int)

# Total calendar months difference
df['months'] = (
(df['end_date'].dt.year - df['start_date'].dt.year) * 12 +
(df['end_date'].dt.month - df['start_date'].dt.month)
)

# Full calendar years difference (correct way)
df['years'] = (
df['end_date'].dt.year - df['start_date'].dt.year
- (
(df['end_date'].dt.month < df['start_date'].dt.month) |
(
(df['end_date'].dt.month == df['start_date'].dt.month) &
(df['end_date'].dt.day < df['start_date'].dt.day)
)
)
)

print(df)

Output:

  start_date   end_date  days  weeks  months  years
0 1998-06-20 2000-10-19 852 121 28 2
1 2012-10-18 2021-01-08 3004 429 99 8

Common Mistake: Not Converting to Datetime First

If your date columns are strings, arithmetic will fail:

# ❌ Strings can't be subtracted
df = pd.DataFrame({
'start': ['2020-01-01', '2021-06-15'],
'end': ['2023-01-01', '2024-06-15']
})

df['diff'] = df['end'] - df['start']
# TypeError: unsupported operand type(s) for -: 'str' and 'str'

Fix: Always convert to datetime first:

# ✅ Convert to datetime before calculating
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])
df['diff_months'] = (
(df['end'].dt.year - df['start'].dt.year) * 12 +
(df['end'].dt.month - df['start'].dt.month)
)

Quick Reference

Time UnitCode
Days(df['end'] - df['start']).dt.days
Weeks((df['end'] - df['start']) / np.timedelta64(1, 'W')).astype(int)
Months (approx.)((df['end'] - df['start']).dt.days / 30.44).astype(int)
Months (exact)(end.dt.year - start.dt.year) * 12 + (end.dt.month - start.dt.month)
Years((df['end'] - df['start']) / np.timedelta64(1, 'Y')).astype(int)

Conclusion

Calculating the number of months between two dates in Pandas can be done with day-based approximation (dividing by 30.44) for quick estimates, year-month arithmetic for exact calendar-month counts, or relativedelta for the most precise results.

For days and weeks, Pandas' built-in Timedelta operations and np.timedelta64 provide fast, vectorized calculations.

Always ensure your date columns are converted to datetime type before performing arithmetic, and choose the method that best matches your precision requirements and dataset size.