Skip to main content

Python Pandas: How to Sort a Pandas DataFrame by Date

Datasets frequently contain date columns - order dates, timestamps, expiration dates, or event schedules. Sorting by date is essential for time-series analysis, chronological reporting, and identifying trends. However, date columns are often read as strings by default, which causes incorrect sorting (alphabetical instead of chronological).

This guide walks you through converting date strings to proper datetime objects and sorting your DataFrame by one or multiple date columns.

Why Date Conversion Matters Before Sorting

When Pandas reads a CSV or when you create a DataFrame with date strings, the values are stored as strings, not datetime objects. Sorting strings alphabetically produces different results than sorting dates chronologically:

import pandas as pd

dates = pd.Series(['2024-12-01', '2024-02-15', '2024-11-20'])

# String sorting (alphabetical), works correctly for YYYY-MM-DD format
print("String sort:", dates.sort_values().tolist())

# But this fails with other formats
dates_us = pd.Series(['12/01/2024', '02/15/2024', '11/20/2024'])
print("String sort (MM/DD):", dates_us.sort_values().tolist())

Output:

String sort: ['2024-02-15', '2024-11-20', '2024-12-01']
String sort (MM/DD): ['02/15/2024', '11/20/2024', '12/01/2024']

The YYYY-MM-DD format happens to sort correctly as strings, but MM/DD/YYYY does not - February would come before November alphabetically, which is coincidentally correct here, but 12/01 would sort after 02/15 for the wrong reason. Converting to datetime ensures reliable chronological sorting regardless of the string format.

Sorting by a Single Date Column

Step 1: Create or Load the DataFrame

import pandas as pd

data = pd.DataFrame({
'AdmissionDate': ['2024-01-25', '2024-01-22', '2024-01-20',
'2024-01-18', '2024-01-22', '2024-01-17', '2024-01-21'],
'StudentID': [7, 5, 3, 2, 6, 1, 4],
'Name': ['Ram', 'Shyam', 'Mohan', 'Sohan', 'Lucky', 'Abhinav', 'Danny'],
'Stream': ['CSE', 'ECE', 'Civil', 'Mechanical', 'CSE', 'IT', 'EEE']
})

print(data)
print("\nAdmissionDate dtype:", data['AdmissionDate'].dtype)

Output:

  AdmissionDate  StudentID     Name      Stream
0 2024-01-25 7 Ram CSE
1 2024-01-22 5 Shyam ECE
2 2024-01-20 3 Mohan Civil
3 2024-01-18 2 Sohan Mechanical
4 2024-01-22 6 Lucky CSE
5 2024-01-17 1 Abhinav IT
6 2024-01-21 4 Danny EEE

AdmissionDate dtype: object

The AdmissionDate column has dtype object, meaning it contains strings.

Step 2: Convert Strings to Datetime

Use pd.to_datetime() to convert the column:

data['AdmissionDate'] = pd.to_datetime(data['AdmissionDate'])

print("AdmissionDate dtype:", data['AdmissionDate'].dtype)

Output:

AdmissionDate dtype: datetime64[ns]

The column is now a proper datetime type and will sort chronologically.

Step 3: Sort by Date

# Sort in ascending order (earliest first)
sorted_asc = data.sort_values(by='AdmissionDate')
print(sorted_asc)

Output:

  AdmissionDate  StudentID     Name      Stream
5 2024-01-17 1 Abhinav IT
3 2024-01-18 2 Sohan Mechanical
2 2024-01-20 3 Mohan Civil
6 2024-01-21 4 Danny EEE
1 2024-01-22 5 Shyam ECE
4 2024-01-22 6 Lucky CSE
0 2024-01-25 7 Ram CSE

Records are now in chronological order from the earliest to the latest admission date.

Sorting in Descending Order (Most Recent First)

sorted_desc = data.sort_values(by='AdmissionDate', ascending=False)
print(sorted_desc)

Output:

  AdmissionDate  StudentID     Name      Stream
0 2024-01-25 7 Ram CSE
1 2024-01-22 5 Shyam ECE
4 2024-01-22 6 Lucky CSE
6 2024-01-21 4 Danny EEE
2 2024-01-20 3 Mohan Civil
3 2024-01-18 2 Sohan Mechanical
5 2024-01-17 1 Abhinav IT

Resetting the Index After Sorting

After sorting, the original index values are preserved (e.g., 5, 3, 2, 6, ...). To get a clean sequential index:

sorted_clean = data.sort_values(by='AdmissionDate', ignore_index=True)
print(sorted_clean)

Output:

  AdmissionDate  StudentID     Name      Stream
0 2024-01-17 1 Abhinav IT
1 2024-01-18 2 Sohan Mechanical
2 2024-01-20 3 Mohan Civil
3 2024-01-21 4 Danny EEE
4 2024-01-22 5 Shyam ECE
5 2024-01-22 6 Lucky CSE
6 2024-01-25 7 Ram CSE

Sorting by Multiple Date Columns

When a DataFrame has more than one date column, you can specify a priority order. The DataFrame is sorted by the first column, and ties are broken by the second.

import pandas as pd

products = pd.DataFrame({
'Mfg_Date': ['2024-01-25', '2024-01-22', '2024-01-20',
'2024-01-18', '2024-01-22', '2024-01-17', '2024-01-21'],
'ProductID': [7, 5, 3, 2, 6, 1, 4],
'Product': ['Paracetamol', 'Moov', 'Volini', 'Crocin',
'Aciloc', 'Iodex', 'Combiflam'],
'Expiry_Date': ['2025-01-25', '2026-01-22', '2024-05-20',
'2025-03-18', '2025-01-22', '2024-05-17', '2025-01-30']
})

# Convert both columns to datetime
products['Mfg_Date'] = pd.to_datetime(products['Mfg_Date'])
products['Expiry_Date'] = pd.to_datetime(products['Expiry_Date'])

# Sort by Expiry_Date first, then by Mfg_Date for ties
sorted_products = products.sort_values(
by=['Expiry_Date', 'Mfg_Date'],
ignore_index=True
)
print(sorted_products)

Output:

    Mfg_Date  ProductID      Product Expiry_Date
0 2024-01-17 1 Iodex 2024-05-17
1 2024-01-20 3 Volini 2024-05-20
2 2024-01-22 6 Aciloc 2025-01-22
3 2024-01-25 7 Paracetamol 2025-01-25
4 2024-01-21 4 Combiflam 2025-01-30
5 2024-01-18 2 Crocin 2025-03-18
6 2024-01-22 5 Moov 2026-01-22

Products are sorted by expiry date (soonest first), with manufacturing date as the tiebreaker.

Mixed Ascending and Descending

You can sort each date column in a different direction:

# Expiry date ascending, manufacturing date descending
sorted_mixed = products.sort_values(
by=['Expiry_Date', 'Mfg_Date'],
ascending=[True, False],
ignore_index=True
)
print(sorted_mixed)

Output:

    Mfg_Date  ProductID      Product Expiry_Date
0 2024-01-17 1 Iodex 2024-05-17
1 2024-01-20 3 Volini 2024-05-20
2 2024-01-22 6 Aciloc 2025-01-22
3 2024-01-25 7 Paracetamol 2025-01-25
4 2024-01-21 4 Combiflam 2025-01-30
5 2024-01-18 2 Crocin 2025-03-18
6 2024-01-22 5 Moov 2026-01-22

Converting Multiple Date Columns at Once

When you have several date columns, convert them all in a single operation using apply():

date_columns = ['Mfg_Date', 'Expiry_Date']
products[date_columns] = products[date_columns].apply(pd.to_datetime)

Output:

    Mfg_Date  ProductID      Product Expiry_Date
0 2024-01-25 7 Paracetamol 2025-01-25
1 2024-01-22 5 Moov 2026-01-22
2 2024-01-20 3 Volini 2024-05-20
3 2024-01-18 2 Crocin 2025-03-18
4 2024-01-22 6 Aciloc 2025-01-22
5 2024-01-17 1 Iodex 2024-05-17
6 2024-01-21 4 Combiflam 2025-01-30

Parsing Dates During CSV Loading

Instead of converting after loading, you can parse dates directly in read_csv():

import pandas as pd

df = pd.read_csv('data.csv', parse_dates=['AdmissionDate'])
print(df['AdmissionDate'].dtype)

Output:

datetime64[ns]

The column is already a datetime type - no separate conversion step needed.

tip

Using parse_dates in read_csv() is the cleanest approach. It handles conversion automatically and works with most standard date formats. For non-standard formats, combine it with date_format:

df = pd.read_csv('data.csv', parse_dates=['date_col'], date_format='%d/%m/%Y')

Common Mistake: Sorting Without Converting to Datetime First

Sorting a date column that is still stored as strings can produce incorrect results with certain formats:

import pandas as pd

df = pd.DataFrame({
'Date': ['03/15/2024', '12/01/2023', '01/20/2024'],
'Event': ['Conference', 'Workshop', 'Meetup']
})

# WRONG: sorting strings, alphabetical order, not chronological
sorted_wrong = df.sort_values(by='Date')
print("String sort (incorrect):")
print(sorted_wrong)

Output:

String sort (incorrect):
Date Event
2 01/20/2024 Meetup
0 03/15/2024 Conference
1 12/01/2023 Workshop

December 2023 should come first chronologically, but it appears last because "12" comes after "01" and "03" alphabetically.

The correct approach:

import pandas as pd

df = pd.DataFrame({
'Date': ['03/15/2024', '12/01/2023', '01/20/2024'],
'Event': ['Conference', 'Workshop', 'Meetup']
})

# CORRECT: convert to datetime first, then sort
df['Date'] = pd.to_datetime(df['Date'])
sorted_correct = df.sort_values(by='Date')
print("Datetime sort (correct):")
print(sorted_correct)

Output:

Datetime sort (correct):
Date Event
1 2023-12-01 Workshop
2 2024-01-20 Meetup
0 2024-03-15 Conference
danger

Always convert date columns to datetime64 before sorting. String-based sorting only works correctly for the YYYY-MM-DD format and fails silently with other formats, producing results that look plausible but are chronologically wrong.

Quick Reference

StepCode
Convert string to datetimedf['date'] = pd.to_datetime(df['date'])
Sort ascending (oldest first)df.sort_values(by='date')
Sort descending (newest first)df.sort_values(by='date', ascending=False)
Sort by multiple date columnsdf.sort_values(by=['date1', 'date2'])
Reset index after sortingdf.sort_values(by='date', ignore_index=True)
Parse dates during CSV loadpd.read_csv('file.csv', parse_dates=['date'])
Convert multiple columns at oncedf[cols].apply(pd.to_datetime)

Converting date columns to proper datetime objects and then using sort_values() is the reliable pattern for chronological sorting in Pandas. This two-step process ensures correct ordering regardless of the original date string format.