
Python Pandas: How to Process Date and Time with Pandas DataFrame in Python

Working with dates and times is one of the most common tasks in data analysis. Raw datetime values in a dataset often need to be transformed - extracting components like year, month, or hour, converting string formats, or filtering rows by time ranges - before they can be used effectively in analysis or modeling.

Pandas was originally built with financial modeling in mind, so it offers a rich set of tools for datetime processing. In this guide, you will learn how to extract datetime features, convert strings to timestamps, reformat date patterns, identify days of the week, and filter data by date and time ranges using Pandas.

Extracting Date and Time Components

One of the most frequent preprocessing steps is splitting a single datetime column into individual components such as year, month, day, hour, and minute. Pandas provides the .dt accessor on datetime Series for this purpose.

import pandas as pd

df = pd.DataFrame()

# Generate 6 timestamps spaced 2 hours apart
df["time"] = pd.date_range("2/5/2019", periods=6, freq="2h")

# Extract individual datetime components
df["year"] = df["time"].dt.year
df["month"] = df["time"].dt.month
df["day"] = df["time"].dt.day
df["hour"] = df["time"].dt.hour
df["minute"] = df["time"].dt.minute

print(df)

Output:

                 time  year  month  day  hour  minute
0 2019-02-05 00:00:00  2019      2    5     0       0
1 2019-02-05 02:00:00  2019      2    5     2       0
2 2019-02-05 04:00:00  2019      2    5     4       0
3 2019-02-05 06:00:00  2019      2    5     6       0
4 2019-02-05 08:00:00  2019      2    5     8       0
5 2019-02-05 10:00:00  2019      2    5    10       0

Tip

The .dt accessor provides many more properties beyond the ones shown above. Some useful ones include:

  • df["time"].dt.second - seconds
  • df["time"].dt.day_of_week - day of the week as an integer (Monday=0, Sunday=6)
  • df["time"].dt.day_name() - full day name (e.g., "Monday")
  • df["time"].dt.quarter - quarter of the year (1–4)
  • df["time"].dt.is_month_end - whether the date is the last day of the month
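As a rough illustration of the properties listed above, the following sketch applies them to the same six timestamps used earlier. The variable names times and extras are placeholders chosen for this example.

```python
import pandas as pd

# Same six timestamps as in the extraction example (all fall on 5 February 2019)
times = pd.Series(pd.date_range("2/5/2019", periods=6, freq="2h"))

# Collect a few additional .dt properties side by side
extras = pd.DataFrame({
    "time": times,
    "second": times.dt.second,
    "day_of_week": times.dt.day_of_week,   # Monday=0, so a Tuesday gives 1
    "day_name": times.dt.day_name(),
    "quarter": times.dt.quarter,
    "is_month_end": times.dt.is_month_end,
})

print(extras)
```

Note that day_name() is a method and needs parentheses, while the others are plain attributes.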

Converting Strings to Timestamps

Real-world datasets often store dates as strings in various formats. Use pd.to_datetime() to parse these strings into proper Timestamp objects that Pandas can work with.

import numpy as np
import pandas as pd

# Date strings in a non-standard format
date_strings = np.array([
    "04-03-2019 12:35 PM",
    "22-06-2017 11:01 AM",
    "05-09-2009 07:09 PM",
])

# Convert to datetime using the correct format string
timestamps = pd.to_datetime(date_strings, format="%d-%m-%Y %I:%M %p")

print(timestamps)

Output:

DatetimeIndex(['2019-03-04 12:35:00', '2017-06-22 11:01:00',
               '2009-09-05 19:09:00'],
              dtype='datetime64[ns]', freq=None)

Common Mistake: Wrong Format String

If your format string does not match the actual date pattern, pd.to_datetime() will either raise an error or produce incorrect results.

# Wrong format: month and day swapped
wrong = pd.to_datetime("04-03-2019 12:35 PM", format="%m-%d-%Y %I:%M %p")
print(wrong) # 2019-04-03 12:35:00: April 3rd instead of March 4th!

Correct approach:

# Correct format: day first, then month
correct = pd.to_datetime("04-03-2019 12:35 PM", format="%d-%m-%Y %I:%M %p")
print(correct) # 2019-03-04 12:35:00: March 4th ✓

Info

Use the errors parameter to handle unparseable dates gracefully:

  • errors="raise" (default) - raises an exception on invalid dates.
  • errors="coerce" - converts invalid dates to NaT (Not a Time) instead of failing.
  • errors="ignore" - returns the original input unchanged if parsing fails. This option is deprecated since Pandas 2.2; catch the exception explicitly instead.

pd.to_datetime("not-a-date", errors="coerce")
# Returns: NaT
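In practice, errors="coerce" is usually applied to a whole column, and the resulting NaT values are then used to locate the bad rows. The sketch below assumes a small made-up Series named raw with one invalid entry.

```python
import pandas as pd

# Hypothetical input with one value that cannot be parsed
raw = pd.Series(["04-03-2019 12:35 PM", "not-a-date", "22-06-2017 11:01 AM"])

# Invalid entries become NaT instead of raising an exception
parsed = pd.to_datetime(raw, format="%d-%m-%Y %I:%M %p", errors="coerce")

# Use the NaT positions to look up the original offending strings
bad_rows = raw[parsed.isna()]

print(parsed)
print("Unparseable values:", bad_rows.tolist())
```

Checking bad_rows before dropping or filling the NaT values helps catch systematic problems, such as a second date format hiding in the data.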

Common Format Codes Reference

Code   Meaning                        Example
%Y     4-digit year                   2019
%m     Month as zero-padded number    03
%d     Day as zero-padded number      04
%H     Hour (24-hour clock)           14
%I     Hour (12-hour clock)           02
%M     Minute                         35
%S     Second                         09
%p     AM/PM                          PM
%B     Full month name                February
%A     Full day name                  Monday
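To see several of these codes working together, here is a small sketch that parses a verbose date string and then writes it back out in the 12-hour pattern used earlier. The input string is made up for this example, and the name-based codes (%A, %B, %p) are assumed to run under an English locale.

```python
import pandas as pd

# Hypothetical verbose date string using full day and month names
text = "Monday, 04 February 2019 14:35:09"

# %A = day name, %d = day, %B = month name, %Y = year, %H:%M:%S = 24-hour time
ts = pd.to_datetime(text, format="%A, %d %B %Y %H:%M:%S")
print(ts)

# The same codes work in the other direction with strftime()
print(ts.strftime("%d-%m-%Y %I:%M %p"))
```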

Changing the Display Format of Timestamps

You can reformat datetime values into custom string patterns using the strftime() method. This is useful for creating human-readable date labels or preparing data for export.

import pandas as pd

df = pd.DataFrame()
df["time"] = pd.date_range("2/5/2019", periods=6, freq="2h")

# Reformat to "day-MonthName" pattern
df["formatted"] = df["time"].dt.strftime("%d-%B")

# Another format: "Year/Month/Day Hour:Minute"
df["custom"] = df["time"].dt.strftime("%Y/%m/%d %H:%M")

print(df)

Output:

                 time    formatted            custom
0 2019-02-05 00:00:00  05-February  2019/02/05 00:00
1 2019-02-05 02:00:00  05-February  2019/02/05 02:00
2 2019-02-05 04:00:00  05-February  2019/02/05 04:00
3 2019-02-05 06:00:00  05-February  2019/02/05 06:00
4 2019-02-05 08:00:00  05-February  2019/02/05 08:00
5 2019-02-05 10:00:00  05-February  2019/02/05 10:00

Caution

The strftime() method converts datetime values to strings (dtype: object). Once converted, you can no longer use .dt accessor methods on that column. Only apply strftime() as a final formatting step, not during intermediate processing.
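The following sketch illustrates the caution above: the formatted column is no longer a datetime column, and it has to be parsed again before .dt can be used. The column names label and restored are placeholders for this example.

```python
import pandas as pd

df = pd.DataFrame({"time": pd.date_range("2/5/2019", periods=3, freq="2h")})

# strftime() returns strings, not datetimes
df["label"] = df["time"].dt.strftime("%Y/%m/%d %H:%M")

print(pd.api.types.is_datetime64_any_dtype(df["time"]))   # True
print(pd.api.types.is_datetime64_any_dtype(df["label"]))  # False

# To use .dt again, parse the strings back with the same format
df["restored"] = pd.to_datetime(df["label"], format="%Y/%m/%d %H:%M")
print(df["restored"].dt.hour.tolist())
```

Keeping the original datetime column and adding the formatted strings as a separate column, as done here, avoids the round trip altogether.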

Extracting Day Names from Dates

To determine which day of the week a date falls on, use the day_name() method:

import pandas as pd

dates = pd.Series(pd.date_range("2/5/2019", periods=6, freq="ME"))

print("Dates:")
print(dates)
print("\nDay names:")
print(dates.dt.day_name())

Output:

Dates:
0   2019-02-28
1   2019-03-31
2   2019-04-30
3   2019-05-31
4   2019-06-30
5   2019-07-31
dtype: datetime64[ns]

Day names:
0     Thursday
1       Sunday
2      Tuesday
3       Friday
4       Sunday
5    Wednesday
dtype: object

Note

In older versions of Pandas, the attribute dt.weekday_name was used instead of the method dt.day_name(). The attribute was deprecated in version 0.23 and removed in 1.0. If you encounter AttributeError: 'DatetimeProperties' object has no attribute 'weekday_name', switch to dt.day_name().

You can also get the day of the week as an integer (Monday=0, Sunday=6):

print(dates.dt.dayofweek)

Output:

0    3
1    6
2    1
3    4
4    6
5    2
dtype: int32
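One typical use of the integer form is flagging weekends, since Saturday and Sunday map to 5 and 6. This is a minimal sketch on a made-up week of daily dates; the names days and is_weekend are placeholders.

```python
import pandas as pd

# One full week of daily dates, starting on Monday 4 February 2019
days = pd.Series(pd.date_range("2019-02-04", periods=7, freq="D"))

# Saturday=5 and Sunday=6, so anything >= 5 is a weekend day
is_weekend = days.dt.dayofweek >= 5

print(days[is_weekend].dt.day_name().tolist())
```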

Filtering Data by Date and Time Ranges

A common analysis task is selecting rows that fall within a specific date or time window. There are three main approaches, and the right one depends on whether your DataFrame is indexed by datetime.

Method 1: Filtering with Boolean Conditions (No DateTime Index)

When the datetime column is a regular column (not the index), use standard comparison operators:

import pandas as pd

df = pd.DataFrame()
df["date"] = pd.date_range("1/1/2012", periods=1000, freq="h")

print("Total rows:", len(df))
print(df.head())

# Select rows between two specific datetimes
filtered = df[(df["date"] > "2012-01-01 01:00:00") &
              (df["date"] <= "2012-01-01 11:00:00")]

print("\nFiltered rows:")
print(filtered)

Output:

Total rows: 1000
                 date
0 2012-01-01 00:00:00
1 2012-01-01 01:00:00
2 2012-01-01 02:00:00
3 2012-01-01 03:00:00
4 2012-01-01 04:00:00

Filtered rows:
                  date
2  2012-01-01 02:00:00
3  2012-01-01 03:00:00
4  2012-01-01 04:00:00
5  2012-01-01 05:00:00
6  2012-01-01 06:00:00
7  2012-01-01 07:00:00
8  2012-01-01 08:00:00
9  2012-01-01 09:00:00
10 2012-01-01 10:00:00
11 2012-01-01 11:00:00

Method 2: Filtering with .loc (DateTime Index)

When the DataFrame is indexed by datetime, you can use .loc with string-based slicing for cleaner, more readable code:

import pandas as pd

df = pd.DataFrame()
df["date"] = pd.date_range("1/1/2012", periods=1000, freq="h")

# Set datetime column as index
df = df.set_index("date")

# Slice using .loc: both endpoints are inclusive
filtered = df.loc["2012-01-01 04:00:00":"2012-01-01 12:00:00"]

print(filtered)

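Output (the DataFrame has no columns left once "date" becomes the index, so Pandas prints it as an empty DataFrame and lists only the index values):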

Empty DataFrame
Columns: []
Index: [2012-01-01 04:00:00, 2012-01-01 05:00:00, 2012-01-01 06:00:00, 2012-01-01 07:00:00, 2012-01-01 08:00:00, 2012-01-01 09:00:00, 2012-01-01 10:00:00, 2012-01-01 11:00:00, 2012-01-01 12:00:00]

Tip

When using .loc with a datetime index, you can also use partial string indexing for broader selections:

# Select all rows from January 2012
jan_data = df.loc["2012-01"]

# Select all rows from 2012
year_data = df.loc["2012"]

This is one of the most convenient features of having a DatetimeIndex.
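To make partial string indexing more concrete, the sketch below counts how many rows each selection returns for the same 1000 hourly timestamps. The value column is added only so the DataFrame has something to display and is not part of the original example.

```python
import pandas as pd

# 1000 hourly timestamps as the index, plus a placeholder data column
idx = pd.date_range("1/1/2012", periods=1000, freq="h")
df = pd.DataFrame({"value": range(1000)}, index=idx)

one_day = df.loc["2012-01-02"]   # every hour of a single day
january = df.loc["2012-01"]      # every hour of the month

print(len(one_day))     # 24 hourly rows
print(len(january))     # 31 days * 24 hours = 744 rows
print(df.index.max())   # the data runs into February
```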

Method 3: Using between() for Cleaner Syntax

For a more readable alternative to chained comparisons:

import pandas as pd

df = pd.DataFrame()
df["date"] = pd.date_range("1/1/2012", periods=1000, freq="h")

filtered = df[df["date"].between("2012-01-01 04:00:00", "2012-01-01 08:00:00")]

print(filtered)

Output:

                 date
4 2012-01-01 04:00:00
5 2012-01-01 05:00:00
6 2012-01-01 06:00:00
7 2012-01-01 07:00:00
8 2012-01-01 08:00:00
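By default, between() includes both endpoints, as the output above shows. If you need a half-open or fully open interval, the inclusive parameter can be adjusted. The sketch below assumes Pandas 1.3 or later, where inclusive accepts the strings "both", "left", "right", and "neither".

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("1/1/2012", periods=1000, freq="h")})
start, end = "2012-01-01 04:00:00", "2012-01-01 08:00:00"

# Default: both endpoints included (04:00 through 08:00)
both = df[df["date"].between(start, end)]

# Left-inclusive only: 04:00 through 07:00
left = df[df["date"].between(start, end, inclusive="left")]

# Neither endpoint: 05:00 through 07:00
neither = df[df["date"].between(start, end, inclusive="neither")]

print(len(both), len(left), len(neither))
```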

Calculating Time Differences

Another common operation is computing the difference between two datetime values, which produces a Timedelta object:

import pandas as pd

df = pd.DataFrame({
    "start": pd.to_datetime(["2023-01-01 08:00", "2023-01-02 09:30"]),
    "end": pd.to_datetime(["2023-01-01 17:30", "2023-01-02 18:00"]),
})

df["duration"] = df["end"] - df["start"]
df["hours_worked"] = df["duration"].dt.total_seconds() / 3600

print(df)

Output:

                start                 end        duration  hours_worked
0 2023-01-01 08:00:00 2023-01-01 17:30:00 0 days 09:30:00           9.5
1 2023-01-02 09:30:00 2023-01-02 18:00:00 0 days 08:30:00           8.5
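Timedelta values can also be compared, added to timestamps, and computed between consecutive rows. The sketch below reuses the same start and end times; the 45-minute offset, the 9-hour threshold, and the new column names are made up for illustration.

```python
import pandas as pd

events = pd.DataFrame({
    "start": pd.to_datetime(["2023-01-01 08:00", "2023-01-02 09:30"]),
    "end": pd.to_datetime(["2023-01-01 17:30", "2023-01-02 18:00"]),
})
duration = events["end"] - events["start"]

# Add a fixed offset to every timestamp (hypothetical 45-minute extension)
events["end_extended"] = events["end"] + pd.Timedelta(minutes=45)

# Compare durations against a threshold (hypothetical 9-hour limit)
events["long_shift"] = duration > pd.Timedelta(hours=9)

# Time elapsed since the previous row's start; the first row has no predecessor
events["gap_since_previous"] = events["start"].diff()

print(events)
```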

Summary

Pandas provides a comprehensive toolkit for processing dates and times in DataFrames. Here is a quick reference of the key techniques covered:

Task                                     Method
Generate datetime sequences              pd.date_range()
Extract components (year, month, etc.)   .dt.year, .dt.month, .dt.hour, etc.
Convert strings to datetime              pd.to_datetime() with format parameter
Reformat datetime display                .dt.strftime()
Get day of week name                     .dt.day_name()
Filter by date range (no index)          Boolean conditions or .between()
Filter by date range (datetime index)    .loc[] with string slicing
Calculate time differences               Subtraction → Timedelta, then .dt.total_seconds()

With these operations, you can handle most of the preprocessing and analysis that time-series and other datetime-based datasets require in Python.