Skip to main content

Python Pandas: How to GroupBy One Column and Get Mean, Min, and Max Values in Pandas

Grouping data by a category and computing summary statistics is one of the most fundamental operations in data analysis. Pandas' groupby() function lets you split a DataFrame into groups based on one or more columns, then apply aggregation functions like mean, min, and max to each group - all in a single, efficient operation.

In this guide, you will learn how to group a DataFrame by a column and calculate multiple aggregate statistics simultaneously using the agg() method.

Basic Syntax

result = df.groupby('column_name').agg({'value_column': ['mean', 'min', 'max']})
  • groupby('column_name'): Splits the DataFrame into groups based on unique values in the specified column.
  • agg(): Applies one or more aggregation functions to the specified columns.
  • ['mean', 'min', 'max']: The list of functions to compute for each group.

Example 1: Vehicle Top Speeds by Type

import pandas as pd

df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'Name': ['Kawasaki', 'Ducati Panigale', 'Bugatti Chiron',
'Jaguar XJ220', 'Lightning LS-218',
'Hennessey Venom GT', 'BMW S1000RR'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})

print("Original DataFrame:")
print(df)

Output:

Original DataFrame:
Type Name TopSpeed_MPH
0 Bike Kawasaki 186
1 Bike Ducati Panigale 202
2 Car Bugatti Chiron 304
3 Car Jaguar XJ220 210
4 Bike Lightning LS-218 218
5 Car Hennessey Venom GT 270
6 Bike BMW S1000RR 188

Now group by Type and compute the mean, min, and max of TopSpeed_MPH:

import pandas as pd

df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'Name': ['Kawasaki', 'Ducati Panigale', 'Bugatti Chiron',
'Jaguar XJ220', 'Lightning LS-218',
'Hennessey Venom GT', 'BMW S1000RR'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})


result = df.groupby('Type').agg({'TopSpeed_MPH': ['mean', 'min', 'max']})

print("Top Speed statistics grouped by Vehicle Type:")
print(result)

Output:

Top Speed statistics grouped by Vehicle Type:
TopSpeed_MPH
mean min max
Type
Bike 198.500000 186 218
Car 261.333333 210 304

Bikes average 198.5 mph while cars average 261.3 mph. The fastest vehicle overall is a car (Bugatti Chiron at 304 mph), while the fastest bike reaches 218 mph.

Example 2: Sales Data by Salesman

import pandas as pd

sales = pd.DataFrame({
'customer_id': [3005, 3001, 3002, 3009, 3005, 3007,
3002, 3004, 3009, 3008, 3003, 3002],
'salesman_id': [102, 105, 101, 103, 102, 101,
101, 106, 103, 102, 107, 101],
'purchase_amt': [1500, 2700, 1525, 1100, 948, 2400,
5700, 2000, 1280, 2500, 750, 5050]
})

result = sales.groupby('salesman_id').agg({'purchase_amt': ['mean', 'min', 'max']})

print("Purchase Amount statistics grouped by Salesman ID:")
print(result)

Output:

Purchase Amount statistics grouped by Salesman ID:
purchase_amt
mean min max
salesman_id
101 3668.750000 1525 5700
102 1649.333333 948 2500
103 1190.000000 1100 1280
105 2700.000000 2700 2700
106 2000.000000 2000 2000
107 750.000000 750 750

Salesman 101 handles the largest transactions on average and has the widest range between min and max purchases.

Example 3: Player Ages by Team

import pandas as pd

df = pd.DataFrame({
'Team': ['Radisson', 'Radisson', 'Gladiators', 'Blues',
'Gladiators', 'Blues', 'Gladiators', 'Gladiators',
'Blues', 'Blues', 'Radisson', 'Radisson'],
'Position': ['Player', 'Extras', 'Player', 'Extras',
'Extras', 'Player', 'Player', 'Player',
'Extras', 'Player', 'Player', 'Extras'],
'Age': [22, 24, 21, 29, 32, 20, 21, 23, 30, 26, 20, 31]
})

result = df.groupby('Team').agg({'Age': ['mean', 'min', 'max']})

print("Age statistics grouped by Team:")
print(result)

Output:

Age statistics grouped by Team:
Age
mean min max
Team
Blues 26.25 20 30
Gladiators 24.25 21 32
Radisson 24.25 20 31

Adding More Aggregation Functions

You can include any number of aggregation functions - including built-in functions, NumPy functions, or even custom functions:

import pandas as pd
import numpy as np

df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})

result = df.groupby('Type').agg({
'TopSpeed_MPH': ['mean', 'min', 'max', 'median', 'std', 'count']
})

print(result)

Output:

     TopSpeed_MPH                                  
mean min max median std count
Type
Bike 198.500000 186 218 195.0 14.821156 4
Car 261.333333 210 304 270.0 47.595518 3

Flattening MultiIndex Columns

The agg() method produces a MultiIndex for column headers. To flatten them into simple column names:

import pandas as pd

df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})

result = df.groupby('Type').agg({'TopSpeed_MPH': ['mean', 'min', 'max']})

# Flatten column names
result.columns = ['_'.join(col) for col in result.columns]
result = result.reset_index()

print(result)

Output:

   Type  TopSpeed_MPH_mean  TopSpeed_MPH_min  TopSpeed_MPH_max
0 Bike 198.500000 186 218
1 Car 261.333333 210 304
tip

Flattening MultiIndex columns makes the result easier to work with in downstream operations like merging, exporting to CSV, or plotting.

Aggregating Multiple Columns at Once

You can apply different aggregation functions to different columns in a single agg() call:

import pandas as pd

df = pd.DataFrame({
'Team': ['Radisson', 'Radisson', 'Gladiators', 'Blues',
'Gladiators', 'Blues', 'Gladiators', 'Radisson'],
'Age': [22, 24, 21, 29, 32, 20, 21, 31],
'Score': [88, 92, 75, 80, 95, 70, 85, 90]
})

result = df.groupby('Team').agg({
'Age': ['mean', 'min', 'max'],
'Score': ['mean', 'max']
})

print(result)

Output:

                  Age         Score    
mean min max mean max
Team
Blues 24.500000 20 29 75.0 80
Gladiators 24.666667 21 32 85.0 95
Radisson 25.666667 22 31 90.0 92

Using Named Aggregation (Pandas 0.25+)

For cleaner column naming, use named aggregation with tuples:

import pandas as pd

df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})

result = df.groupby('Type').agg(
avg_speed=('TopSpeed_MPH', 'mean'),
min_speed=('TopSpeed_MPH', 'min'),
max_speed=('TopSpeed_MPH', 'max'),
count=('TopSpeed_MPH', 'count')
).reset_index()

print(result)

Output:

   Type   avg_speed  min_speed  max_speed  count
0 Bike 198.500000 186 218 4
1 Car 261.333333 210 304 3
info

Named aggregation produces flat column names directly - no MultiIndex to flatten. It is the recommended approach for Pandas 0.25 and later.

Quick Reference

TaskCode
Mean, min, max of one columndf.groupby('A').agg({'B': ['mean', 'min', 'max']})
Multiple columns, different functionsdf.groupby('A').agg({'B': ['mean'], 'C': ['sum']})
Named aggregationdf.groupby('A').agg(avg_b=('B', 'mean'))
Flatten MultiIndex columnsresult.columns = ['_'.join(col) for col in result.columns]
Reset index after groupbyresult.reset_index()

Conclusion

Pandas' groupby() combined with agg() provides a powerful and concise way to compute multiple summary statistics per group in a single operation.

Whether you need the mean, min, and max of a single column or different aggregations across multiple columns, the agg() method handles it efficiently.

For the cleanest results, use named aggregation (available in Pandas 0.25+) to produce flat, descriptive column names without manual post-processing.