Python Pandas: How to GroupBy One Column and Get Mean, Min, and Max Values in Pandas
Grouping data by a category and computing summary statistics is one of the most fundamental operations in data analysis. Pandas' groupby() function lets you split a DataFrame into groups based on one or more columns, then apply aggregation functions like mean, min, and max to each group - all in a single, efficient operation.
In this guide, you will learn how to group a DataFrame by a column and calculate multiple aggregate statistics simultaneously using the agg() method.
Basic Syntax
result = df.groupby('column_name').agg({'value_column': ['mean', 'min', 'max']})
groupby('column_name'): Splits the DataFrame into groups based on unique values in the specified column.agg(): Applies one or more aggregation functions to the specified columns.['mean', 'min', 'max']: The list of functions to compute for each group.
Example 1: Vehicle Top Speeds by Type
import pandas as pd
df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'Name': ['Kawasaki', 'Ducati Panigale', 'Bugatti Chiron',
'Jaguar XJ220', 'Lightning LS-218',
'Hennessey Venom GT', 'BMW S1000RR'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})
print("Original DataFrame:")
print(df)
Output:
Original DataFrame:
Type Name TopSpeed_MPH
0 Bike Kawasaki 186
1 Bike Ducati Panigale 202
2 Car Bugatti Chiron 304
3 Car Jaguar XJ220 210
4 Bike Lightning LS-218 218
5 Car Hennessey Venom GT 270
6 Bike BMW S1000RR 188
Now group by Type and compute the mean, min, and max of TopSpeed_MPH:
import pandas as pd
df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'Name': ['Kawasaki', 'Ducati Panigale', 'Bugatti Chiron',
'Jaguar XJ220', 'Lightning LS-218',
'Hennessey Venom GT', 'BMW S1000RR'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})
result = df.groupby('Type').agg({'TopSpeed_MPH': ['mean', 'min', 'max']})
print("Top Speed statistics grouped by Vehicle Type:")
print(result)
Output:
Top Speed statistics grouped by Vehicle Type:
TopSpeed_MPH
mean min max
Type
Bike 198.500000 186 218
Car 261.333333 210 304
Bikes average 198.5 mph while cars average 261.3 mph. The fastest vehicle overall is a car (Bugatti Chiron at 304 mph), while the fastest bike reaches 218 mph.
Example 2: Sales Data by Salesman
import pandas as pd
sales = pd.DataFrame({
'customer_id': [3005, 3001, 3002, 3009, 3005, 3007,
3002, 3004, 3009, 3008, 3003, 3002],
'salesman_id': [102, 105, 101, 103, 102, 101,
101, 106, 103, 102, 107, 101],
'purchase_amt': [1500, 2700, 1525, 1100, 948, 2400,
5700, 2000, 1280, 2500, 750, 5050]
})
result = sales.groupby('salesman_id').agg({'purchase_amt': ['mean', 'min', 'max']})
print("Purchase Amount statistics grouped by Salesman ID:")
print(result)
Output:
Purchase Amount statistics grouped by Salesman ID:
purchase_amt
mean min max
salesman_id
101 3668.750000 1525 5700
102 1649.333333 948 2500
103 1190.000000 1100 1280
105 2700.000000 2700 2700
106 2000.000000 2000 2000
107 750.000000 750 750
Salesman 101 handles the largest transactions on average and has the widest range between min and max purchases.
Example 3: Player Ages by Team
import pandas as pd
df = pd.DataFrame({
'Team': ['Radisson', 'Radisson', 'Gladiators', 'Blues',
'Gladiators', 'Blues', 'Gladiators', 'Gladiators',
'Blues', 'Blues', 'Radisson', 'Radisson'],
'Position': ['Player', 'Extras', 'Player', 'Extras',
'Extras', 'Player', 'Player', 'Player',
'Extras', 'Player', 'Player', 'Extras'],
'Age': [22, 24, 21, 29, 32, 20, 21, 23, 30, 26, 20, 31]
})
result = df.groupby('Team').agg({'Age': ['mean', 'min', 'max']})
print("Age statistics grouped by Team:")
print(result)
Output:
Age statistics grouped by Team:
Age
mean min max
Team
Blues 26.25 20 30
Gladiators 24.25 21 32
Radisson 24.25 20 31
Adding More Aggregation Functions
You can include any number of aggregation functions - including built-in functions, NumPy functions, or even custom functions:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})
result = df.groupby('Type').agg({
'TopSpeed_MPH': ['mean', 'min', 'max', 'median', 'std', 'count']
})
print(result)
Output:
TopSpeed_MPH
mean min max median std count
Type
Bike 198.500000 186 218 195.0 14.821156 4
Car 261.333333 210 304 270.0 47.595518 3
Flattening MultiIndex Columns
The agg() method produces a MultiIndex for column headers. To flatten them into simple column names:
import pandas as pd
df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})
result = df.groupby('Type').agg({'TopSpeed_MPH': ['mean', 'min', 'max']})
# Flatten column names
result.columns = ['_'.join(col) for col in result.columns]
result = result.reset_index()
print(result)
Output:
Type TopSpeed_MPH_mean TopSpeed_MPH_min TopSpeed_MPH_max
0 Bike 198.500000 186 218
1 Car 261.333333 210 304
Flattening MultiIndex columns makes the result easier to work with in downstream operations like merging, exporting to CSV, or plotting.
Aggregating Multiple Columns at Once
You can apply different aggregation functions to different columns in a single agg() call:
import pandas as pd
df = pd.DataFrame({
'Team': ['Radisson', 'Radisson', 'Gladiators', 'Blues',
'Gladiators', 'Blues', 'Gladiators', 'Radisson'],
'Age': [22, 24, 21, 29, 32, 20, 21, 31],
'Score': [88, 92, 75, 80, 95, 70, 85, 90]
})
result = df.groupby('Team').agg({
'Age': ['mean', 'min', 'max'],
'Score': ['mean', 'max']
})
print(result)
Output:
Age Score
mean min max mean max
Team
Blues 24.500000 20 29 75.0 80
Gladiators 24.666667 21 32 85.0 95
Radisson 25.666667 22 31 90.0 92
Using Named Aggregation (Pandas 0.25+)
For cleaner column naming, use named aggregation with tuples:
import pandas as pd
df = pd.DataFrame({
'Type': ['Bike', 'Bike', 'Car', 'Car', 'Bike', 'Car', 'Bike'],
'TopSpeed_MPH': [186, 202, 304, 210, 218, 270, 188]
})
result = df.groupby('Type').agg(
avg_speed=('TopSpeed_MPH', 'mean'),
min_speed=('TopSpeed_MPH', 'min'),
max_speed=('TopSpeed_MPH', 'max'),
count=('TopSpeed_MPH', 'count')
).reset_index()
print(result)
Output:
Type avg_speed min_speed max_speed count
0 Bike 198.500000 186 218 4
1 Car 261.333333 210 304 3
Named aggregation produces flat column names directly - no MultiIndex to flatten. It is the recommended approach for Pandas 0.25 and later.
Quick Reference
| Task | Code |
|---|---|
| Mean, min, max of one column | df.groupby('A').agg({'B': ['mean', 'min', 'max']}) |
| Multiple columns, different functions | df.groupby('A').agg({'B': ['mean'], 'C': ['sum']}) |
| Named aggregation | df.groupby('A').agg(avg_b=('B', 'mean')) |
| Flatten MultiIndex columns | result.columns = ['_'.join(col) for col in result.columns] |
| Reset index after groupby | result.reset_index() |
Conclusion
Pandas' groupby() combined with agg() provides a powerful and concise way to compute multiple summary statistics per group in a single operation.
Whether you need the mean, min, and max of a single column or different aggregations across multiple columns, the agg() method handles it efficiently.
For the cleanest results, use named aggregation (available in Pandas 0.25+) to produce flat, descriptive column names without manual post-processing.