Skip to main content

How to Sort a CSV File by Multiple Columns in Python

Sorting CSV data by multiple columns is a common task in data analysis: for example, sorting sales records first by region and then by revenue within each region, or ordering student data by grade and then by name. Python's Pandas library makes this straightforward by loading the CSV into a DataFrame and using the sort_values() method.

This guide covers sorting by one or more columns with mixed ascending and descending orders.

Understanding sort_values() Syntax

DataFrame.sort_values(
by,
axis=0,
ascending=True,
inplace=False,
kind='quicksort',
na_position='last',
ignore_index=False
)
ParameterDescriptionDefault
byColumn name or list of column names to sort byRequired
axis0 to sort rows0
ascendingTrue for ascending, False for descending. Pass a list for mixed ordersTrue
inplaceIf True, modifies the DataFrame directlyFalse
na_positionWhere to place NaN values: 'first' or 'last''last'
ignore_indexIf True, resets the index after sortingFalse

Sample CSV File

The examples use a CSV file called products.csv with the following content:

products.csv
Product,Category,Price,Rating
Laptop,Electronics,999.99,4.5
Mouse,Electronics,29.99,4.2
Desk,Furniture,249.99,4.7
Chair,Furniture,199.99,4.3
Keyboard,Electronics,79.99,4.6
Bookshelf,Furniture,149.99,4.1
Monitor,Electronics,399.99,4.8
Lamp,Furniture,49.99,3.9

Sorting by a Single Column

Start by loading the CSV and sorting by one column:

import pandas as pd

data = pd.read_csv('products.csv')

# Sort by Price in ascending order
sorted_data = data.sort_values(by='Price')
print(sorted_data)

Output:

     Product     Category   Price  Rating
1 Mouse Electronics 29.99 4.2
7 Lamp Furniture 49.99 3.9
4 Keyboard Electronics 79.99 4.6
5 Bookshelf Furniture 149.99 4.1
3 Chair Furniture 199.99 4.3
2 Desk Furniture 249.99 4.7
6 Monitor Electronics 399.99 4.8
0 Laptop Electronics 999.99 4.5
tip

When sorting by a single column, you can pass the column name directly as a string: sort_values(by='Price'). For multiple columns, pass a list: sort_values(by=['col1', 'col2']).

Sorting by Multiple Columns (Ascending)

To sort by more than one column, pass a list of column names. The DataFrame is sorted by the first column in the list, and ties are broken by the second column, and so on:

import pandas as pd

data = pd.read_csv('products.csv')

# Sort by Category first, then by Price within each category
sorted_data = data.sort_values(by=['Category', 'Price'], ignore_index=True)
print(sorted_data)

Output:

     Product     Category   Price  Rating
0 Mouse Electronics 29.99 4.2
1 Keyboard Electronics 79.99 4.6
2 Monitor Electronics 399.99 4.8
3 Laptop Electronics 999.99 4.5
4 Lamp Furniture 49.99 3.9
5 Bookshelf Furniture 149.99 4.1
6 Chair Furniture 199.99 4.3
7 Desk Furniture 249.99 4.7

Products are grouped by Category (Electronics first, then Furniture), and within each category, they are sorted by Price from lowest to highest.

Sorting with Mixed Ascending and Descending Orders

Pass a list of booleans to the ascending parameter to control the direction for each column independently:

import pandas as pd

data = pd.read_csv('products.csv')

# Sort by Category ascending, then by Price descending within each category
sorted_data = data.sort_values(
by=['Category', 'Price'],
ascending=[True, False],
ignore_index=True
)
print(sorted_data)

Output:

     Product     Category   Price  Rating
0 Laptop Electronics 999.99 4.5
1 Monitor Electronics 399.99 4.8
2 Keyboard Electronics 79.99 4.6
3 Mouse Electronics 29.99 4.2
4 Desk Furniture 249.99 4.7
5 Chair Furniture 199.99 4.3
6 Bookshelf Furniture 149.99 4.1
7 Lamp Furniture 49.99 3.9

Within each category, products are now sorted from most expensive to least expensive.

Sorting by Three or More Columns

You can sort by as many columns as needed. Each additional column acts as a further tiebreaker:

import pandas as pd

data = pd.read_csv('products.csv')

# Sort by Category (asc), Rating (desc), then Price (asc)
sorted_data = data.sort_values(
by=['Category', 'Rating', 'Price'],
ascending=[True, False, True],
ignore_index=True
)
print(sorted_data)

Output:

     Product     Category   Price  Rating
0 Monitor Electronics 399.99 4.8
1 Keyboard Electronics 79.99 4.6
2 Laptop Electronics 999.99 4.5
3 Mouse Electronics 29.99 4.2
4 Desk Furniture 249.99 4.7
5 Chair Furniture 199.99 4.3
6 Bookshelf Furniture 149.99 4.1
7 Lamp Furniture 49.99 3.9

Within Electronics, products are sorted by highest rating first. If two products had the same rating, the cheaper one would appear first.

Saving the Sorted Data Back to CSV

After sorting, you can write the result to a new CSV file:

import pandas as pd

data = pd.read_csv('products.csv')

sorted_data = data.sort_values(
by=['Category', 'Price'],
ascending=[True, True],
ignore_index=True
)

# Save to a new CSV file
sorted_data.to_csv('products_sorted.csv', index=False)
print("Sorted CSV saved successfully.")
note

Setting index=False prevents the DataFrame index from being written as an extra column in the CSV.

Handling NaN Values During Sorting

CSV files often contain missing data. The na_position parameter controls where NaN values are placed:

import pandas as pd
import numpy as np

data = pd.DataFrame({
'Product': ['Laptop', 'Mouse', 'Desk', 'Chair', 'Keyboard'],
'Category': ['Electronics', 'Electronics', 'Furniture', None, 'Electronics'],
'Price': [999.99, 29.99, np.nan, 199.99, 79.99]
})

# NaN values placed at the beginning
sorted_first = data.sort_values(by='Price', na_position='first')
print("NaN first:")
print(sorted_first)

# NaN values placed at the end (default)
sorted_last = data.sort_values(by='Price', na_position='last')
print("\nNaN last:")
print(sorted_last)

Output:

NaN first:
Product Category Price
2 Desk Furniture NaN
1 Mouse Electronics 29.99
4 Keyboard Electronics 79.99
3 Chair None 199.99
0 Laptop Electronics 999.99

NaN last:
Product Category Price
1 Mouse Electronics 29.99
4 Keyboard Electronics 79.99
3 Chair None 199.99
0 Laptop Electronics 999.99
2 Desk Furniture NaN

Common Mistake: Mismatched ascending List Length

When sorting by multiple columns, the ascending parameter must have the same number of elements as the by parameter:

import pandas as pd

data = pd.read_csv('products.csv')

# WRONG: 2 columns but only 1 ascending value (as a list)
try:
sorted_data = data.sort_values(
by=['Category', 'Price'],
ascending=[True] # Should have 2 values
)
except ValueError as e:
print(f"ValueError: {e}")

Output:

ValueError: Length of ascending (1) != length of by (2)

The correct approach:

# CORRECT: match the number of ascending values to the number of columns
sorted_data = data.sort_values(
by=['Category', 'Price'],
ascending=[True, False] # One value per column
)
warning

If you want the same direction for all columns, pass a single boolean (not a list): ascending=True or ascending=False. Use a list only when columns need different sort directions.

Complete Workflow: Load, Sort, and Save

Here is a complete example combining all steps:

import pandas as pd

# Step 1: Load the CSV
data = pd.read_csv('products.csv')
print("Original data:")
print(data)

# Step 2: Sort by multiple columns
sorted_data = data.sort_values(
by=['Category', 'Rating', 'Price'],
ascending=[True, False, True],
ignore_index=True
)
print("\nSorted data:")
print(sorted_data)

# Step 3: Save the sorted data
sorted_data.to_csv('products_sorted.csv', index=False)
print("\nSaved to products_sorted.csv")

Output:

Original data:
Product Category Price Rating
0 Laptop Electronics 999.99 4.5
1 Mouse Electronics 29.99 4.2
2 Desk Furniture 249.99 4.7
3 Chair Furniture 199.99 4.3
4 Keyboard Electronics 79.99 4.6
5 Bookshelf Furniture 149.99 4.1
6 Monitor Electronics 399.99 4.8
7 Lamp Furniture 49.99 3.9

Sorted data:
Product Category Price Rating
0 Monitor Electronics 399.99 4.8
1 Keyboard Electronics 79.99 4.6
2 Laptop Electronics 999.99 4.5
3 Mouse Electronics 29.99 4.2
4 Desk Furniture 249.99 4.7
5 Chair Furniture 199.99 4.3
6 Bookshelf Furniture 149.99 4.1
7 Lamp Furniture 49.99 3.9

Saved to products_sorted.csv

Quick Reference

GoalCode
Sort by one column (ascending)df.sort_values(by='col')
Sort by one column (descending)df.sort_values(by='col', ascending=False)
Sort by multiple columns (same direction)df.sort_values(by=['col1', 'col2'])
Sort by multiple columns (mixed directions)df.sort_values(by=['col1', 'col2'], ascending=[True, False])
Sort and reset indexdf.sort_values(by='col', ignore_index=True)
Sort and modify original DataFramedf.sort_values(by='col', inplace=True)
Control NaN placementdf.sort_values(by='col', na_position='first')
Save sorted DataFrame to CSVdf.to_csv('output.csv', index=False)

Sorting a CSV file by multiple columns in Python is a three-step process: load with read_csv(), sort with sort_values(), and optionally save with to_csv(). The key to multi-column sorting is providing a list of column names in priority order and matching ascending values for each column when mixed directions are needed.