SQL GROUP BY for Grouping Results

In the previous guide, you learned how aggregate functions condense an entire table into a single summary row. COUNT(*) tells you how many total products you have. AVG(price) gives you one average across all products. But what if you need the average price per category? Or the number of orders per customer? Or the total revenue per month?

These questions require the database to split the data into groups first, then aggregate each group separately. This is exactly what GROUP BY does. It is one of the most powerful clauses in SQL and the foundation of every meaningful report, dashboard, and analytical query.

This guide covers how GROUP BY works conceptually, how to group by one or multiple columns, the critical error that every beginner encounters (selecting non-aggregated columns), and practical real-world examples. Everything uses the ShopSmart sample database with full outputs (we defined it in a previous guide here).

How GROUP BY Works Conceptually

Without GROUP BY, aggregate functions process all matching rows as one big group and return a single result:

-- One group, one result: the average price across ALL products
SELECT AVG(price) AS avg_price
FROM products;

avg_price
61.045

With GROUP BY, the database first divides the rows into groups based on the values in one or more columns, then applies the aggregate function to each group separately, returning one result row per group:

-- Multiple groups, one result per group: average price PER category
SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;

category_id	avg_price
1	57.495
2	38.745000000000005
3	129.99
4	57.663333333333334

Instead of one average for all products, you now have a separate average for each category.

The Mental Model

Think of GROUP BY as a three-step process:

Step 1: Sort rows into groups based on the GROUP BY column:

Group (category_id = 1)	Group (category_id = 2)	Group (category_id = 3)	Group (category_id = 4)
Wireless Mouse ($29.99)	SQL for Beginners ($34.99)	Coffee Maker Pro ($129.99)	Yoga Mat Premium ($38.00)
Mechanical Keyboard ($89.99)	Data Science Handbook ($42.50)		Running Shoes X1 ($110.00)
USB-C Hub ($45.00)			Stainless Water Bottle ($24.99)
Bluetooth Speaker ($65.00)

Step 2: Apply the aggregate function to each group independently:

Category 1: AVG(29.99, 89.99, 45.00, 65.00) = 57.50
Category 2: AVG(34.99, 42.50) = 38.75
Category 3: AVG(129.99) = 129.99
Category 4: AVG(38.00, 110.00, 24.99) = 57.66

Step 3: Return one row per group with the group identifier and the aggregated value.

The Key Insight

GROUP BY does not change which rows are included. It changes how the rows are organized before aggregation. Without it, all rows form one group. With it, rows are split into multiple groups, and each group gets its own summary.

Grouping by One Column

The most common pattern is grouping by a single column to get summaries per category, per status, per customer, or per any other dimension.

Products Per Category

SELECT category_id, COUNT(*) AS product_count
FROM products
GROUP BY category_id
ORDER BY product_count DESC;

Output:

category_id	product_count
1	4
4	3
2	2
3	1

Electronics (category 1) has the most products. Home & Kitchen (category 3) has only one.

Orders Per Status

SELECT status, COUNT(*) AS order_count
FROM orders
GROUP BY status
ORDER BY order_count DESC;

Output:

status	order_count
completed	3
pending	3
shipped	2

Revenue Per Status

SELECT status,
       COUNT(*) AS order_count,
       SUM(total_amount) AS total_revenue,
       ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
GROUP BY status
ORDER BY total_revenue DESC;

Output:

status	order_count	total_revenue	avg_order_value
pending	3	297.96999999999997	99.32
completed	3	287.46	95.82
shipped	2	284.99	142.50

Each status group gets its own count, total, and average. This single query could power an order management dashboard.

Orders Per Customer

SELECT customer_id,
       COUNT(*) AS order_count,
       SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC;

Output:

customer_id	order_count	total_spent
6	1	199.98
1	2	197.47
4	1	155.00
2	2	152.99
3	1	129.99
5	1	34.99

Customer 1 (Alice) has spent the most overall with 2 orders totaling $197.47. Customer 6 (Frank) has the highest single order at $199.98.

Customers Per City

SELECT city, COUNT(*) AS customer_count
FROM customers
WHERE city IS NOT NULL
GROUP BY city
ORDER BY customer_count DESC;

Output:

city	customer_count
New York	2
Chicago	2
Los Angeles	1
Seattle	1

Notice the WHERE city IS NOT NULL filter. Without it, NULL would form its own group. Whether that is desirable depends on your use case.

What Happens to NULL in GROUP BY?

Unlike most SQL operations where NULL creates confusion, GROUP BY treats all NULL values as belonging to the same group:

-- NULL forms its own group
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city
ORDER BY customer_count DESC;

Output:

city	customer_count
New York	2
Chicago	2
Los Angeles	1
Seattle	1
NULL	1

Grace's NULL city creates a group of its own. This is consistent with how DISTINCT treats NULL values.

Grouping by Multiple Columns

You can group by two or more columns to create more specific groupings. The database creates a separate group for each unique combination of values across all grouped columns.

Orders Per Customer Per Status

SELECT customer_id, status, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id, status
ORDER BY customer_id, status;

Output:

customer_id	status	order_count
1	completed	2
2	completed	1
2	pending	1
3	shipped	1
4	shipped	1
5	pending	1
6	pending	1

Customer 2 (Bob) appears in two rows because he has orders in two different statuses. Each unique combination of customer_id and status gets its own group.

Compare this with the single-column grouping:

-- Single-column: 6 groups (one per customer)
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
-- Returns 6 rows

-- Multi-column: 7 groups (one per customer-status combination)
SELECT customer_id, status, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id, status;
-- Returns 7 rows (Bob has 2 rows now)

Products Per Category and Availability

SELECT category_id,
       is_available,
       COUNT(*) AS product_count,
       ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category_id, is_available
ORDER BY category_id, is_available DESC;

Output:

category_id	is_available	product_count	avg_price
1	true	3	54.99
1	false	1	65.00
2	true	2	38.75
3	true	1	129.99
4	true	3	57.66

Category 1 (Electronics) splits into two groups: 3 available products averaging $54.99 and 1 unavailable product at $65.00 (the Bluetooth Speaker).

Orders Per Month and Status

-- Extract year-month from order dates for monthly grouping
-- PostgreSQL syntax
SELECT TO_CHAR(order_date, 'YYYY-MM') AS order_month,
       status,
       COUNT(*) AS order_count,
       SUM(total_amount) AS revenue
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM'), status
ORDER BY order_month, status;

Output:

order_month	status	order_count	revenue
2024-01	completed	2	209.97
2024-02	completed	1	77.49
2024-03	pending	1	34.99
2024-03	shipped	2	284.99
2024-04	pending	2	262.98

Date Grouping Syntax Varies by Database

The function to extract a formatted date string differs across databases:

-- PostgreSQL
TO_CHAR(order_date, 'YYYY-MM')

-- MySQL
DATE_FORMAT(order_date, '%Y-%m')

-- SQLite
STRFTIME('%Y-%m', order_date)

-- SQL Server
FORMAT(order_date, 'yyyy-MM')

The concept is the same in every database. Only the function name and format string differ.

Adding Context with JOINs

Grouping by category_id or customer_id produces correct numbers, but raw IDs are not very readable. Joining with related tables lets you display names alongside the aggregated data.

Product Counts with Category Names

SELECT c.name AS category,
       COUNT(p.id) AS product_count,
       ROUND(AVG(p.price), 2) AS avg_price,
       MIN(p.price) AS cheapest,
       MAX(p.price) AS most_expensive
FROM categories c
LEFT JOIN products p ON c.id = p.category_id
GROUP BY c.name
ORDER BY product_count DESC;

Output:

category	product_count	avg_price	cheapest	most_expensive
Electronics	4	57.50	29.99	89.99
Sports	3	57.66	24.99	110.00
Books	2	38.75	34.99	42.50
Home & Kitchen	1	129.99	129.99	129.99

Now the report shows "Electronics" instead of "1", which is far more useful for anyone reading it.

Customer Order Summary with Names

SELECT c.first_name || ' ' || c.last_name AS customer,
       c.city,
       COUNT(o.id) AS order_count,
       COALESCE(SUM(o.total_amount), 0) AS total_spent,
       COALESCE(ROUND(AVG(o.total_amount), 2), 0) AS avg_order_value
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.first_name, c.last_name, c.city
ORDER BY total_spent DESC;

Output:

customer	city	order_count	total_spent	avg_order_value
Frank Wilson	Chicago	1	199.98	199.98
Alice Johnson	New York	2	197.47	98.73
David Chen	New York	1	155.00	155.00
Bob Martinez	Los Angeles	2	152.99	76.50
Carol Singh	Chicago	1	129.99	129.99
Eva Brown	Seattle	1	34.99	34.99
Grace Taylor	NULL	0	0	0

The LEFT JOIN ensures Grace appears even though she has no orders. COALESCE converts her NULL totals to 0.

Why GROUP BY c.id?

When grouping by customer, include the primary key (c.id) in the GROUP BY clause. This guarantees each customer gets their own group, even if two customers share the same name. The additional columns (first_name, last_name, city) are functionally dependent on c.id, so including them is safe and necessary for display.

The Most Common GROUP BY Error

This is the single most important section of this guide. Almost every SQL beginner encounters this error, and understanding why it happens will permanently improve your SQL writing.

The Error

-- This query FAILS in most databases
SELECT name, category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;

Error message (PostgreSQL):

ERROR: column "products.name" must appear in the GROUP BY clause
or be used in an aggregate function

Error message (SQL Server):

Column 'products.name' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause.

Why It Fails

Let us trace through the logic. When you write GROUP BY category_id, the database creates one group per category:

Group: category_id = 1	Group: category_id = 2
Wireless Mouse	SQL for Beginners
Mechanical Keyboard	Data Science Handbook
USB-C Hub
Bluetooth Speaker

For category 1, AVG(price) produces one number: 57.50. But name has four different values in this group. Which name should the database display? "Wireless Mouse"? "Mechanical Keyboard"? There is no logical answer. The database refuses to guess and raises an error instead.

The Rule

Every column in your SELECT list must either:

Appear in the GROUP BY clause, or
Be used inside an aggregate function (COUNT, SUM, AVG, MIN, MAX)

No exceptions (with one caveat for MySQL, discussed below).

How to Fix It

There are several valid approaches depending on what you actually need:

Fix 1: Remove the non-aggregated column

-- If you just want the average price per category, you don't need the name
SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;

Fix 2: Add the column to GROUP BY

-- This creates a group for each unique (category_id, name) combination
-- Probably NOT what you want, as each product becomes its own group
SELECT name, category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id, name;

Each product is now its own group, and the "average" is just each product's own price. This is technically valid but defeats the purpose of aggregation.

Fix 3: Aggregate the column

-- Use MIN or MAX on the name column if you want a representative name
SELECT MIN(name) AS first_product_name,
       category_id,
       ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category_id;

Fix 4: Use a JOIN or subquery to bring in non-grouped data

This is the most robust approach when you need detailed information alongside grouped summaries, but it requires more advanced techniques covered in future guides.

Multiple Wrong Examples and Their Fixes

-- Wrong: 'email' is not grouped or aggregated
SELECT city, email, COUNT(*) AS customer_count
FROM customers
GROUP BY city;
-- ERROR

-- Fix: Remove email or aggregate it
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city;

-- Wrong: 'order_date' is not grouped or aggregated
SELECT customer_id, order_date, SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id;
-- ERROR

-- Fix option 1: Add order_date to GROUP BY (groups by customer AND date)
SELECT customer_id, order_date, SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id, order_date;

-- Fix option 2: Aggregate order_date
SELECT customer_id,
       MIN(order_date) AS first_order,
       MAX(order_date) AS last_order,
       SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id;

MySQL's Dangerous Exception

MySQL with the default sql_mode setting may not raise an error when you select non-aggregated columns without grouping them. Instead, it silently returns an arbitrary value from the group.

-- MySQL (default settings): No error, but WRONG results
SELECT name, category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;
-- Returns one row per category, but the 'name' value is random/unpredictable

This is not a feature; it is a trap. The name value you see has no logical relationship to the average. Enable ONLY_FULL_GROUP_BY mode in MySQL to enforce the standard behavior:

SET sql_mode = 'ONLY_FULL_GROUP_BY';

With this mode enabled, MySQL behaves like PostgreSQL and SQL Server, rejecting queries with non-aggregated columns not in GROUP BY.

GROUP BY with WHERE

WHERE filters rows before grouping. The database removes rows that do not match the WHERE condition, then groups the remaining rows.

-- Count available products per category
SELECT category_id, COUNT(*) AS available_count
FROM products
WHERE is_available = true
GROUP BY category_id
ORDER BY available_count DESC;

Output:

category_id	available_count
4	3
1	3
2	2
3	1

Compare this with the unfiltered count:

-- Count ALL products per category (including unavailable)
SELECT category_id, COUNT(*) AS total_count
FROM products
GROUP BY category_id
ORDER BY total_count DESC;

category_id	total_count
1	4
4	3
2	2
3	1

Category 1 drops from 4 to 3 because the Bluetooth Speaker (unavailable) is excluded by WHERE.

Revenue from Completed Orders Per Customer

SELECT customer_id,
       COUNT(*) AS completed_orders,
       SUM(total_amount) AS completed_revenue
FROM orders
WHERE status = 'completed'
GROUP BY customer_id
ORDER BY completed_revenue DESC;

Output:

customer_id	completed_orders	completed_revenue
1	2	197.47
2	1	89.99

Only Alice and Bob have completed orders.

The Processing Order

Understanding the order in which SQL processes clauses is critical when using GROUP BY:

Step	Clause	What Happens
1	`FROM`	Identify the source table(s)
2	`WHERE`	Filter individual rows
3	`GROUP BY`	Divide remaining rows into groups
4	`HAVING`	Filter groups (covered in the next guide)
5	`SELECT`	Calculate expressions, apply aliases
6	`ORDER BY`	Sort the final results
7	`LIMIT`	Restrict the number of rows returned

This order explains several important behaviors:

WHERE cannot use aggregate functions (aggregation has not happened yet at step 2)
WHERE cannot use column aliases (aliases are not created until step 5)
ORDER BY can use aliases (it runs after SELECT)
GROUP BY cannot use column aliases in most databases (it runs before SELECT)

GROUP BY with ORDER BY

You can sort grouped results just like any other query. Sorting by the aggregate result is particularly useful:

Categories Ranked by Average Price

SELECT category_id,
       ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category_id
ORDER BY avg_price DESC;

Output:

category_id	avg_price
3	129.99
4	57.66
1	57.49
2	38.75

Customers Ranked by Total Spending

SELECT customer_id,
       SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC;

Output:

customer_id	total_spent
6	199.98
1	197.47
4	155.00
2	152.99
3	129.99
5	34.99

Top 3 Categories by Product Count

Combining GROUP BY, ORDER BY, and LIMIT:

SELECT category_id,
       COUNT(*) AS product_count
FROM products
GROUP BY category_id
ORDER BY product_count DESC
LIMIT 3;

Output:

category_id	product_count
1	4
4	3
2	2

Real-World GROUP BY Scenarios

Sales Report by Month

-- PostgreSQL syntax
SELECT TO_CHAR(order_date, 'YYYY-MM') AS month,
       COUNT(*) AS orders,
       SUM(total_amount) AS revenue,
       ROUND(AVG(total_amount), 2) AS avg_order
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM')
ORDER BY month;

Output:

month	orders	revenue	avg_order
2024-01	2	209.97	104.99
2024-02	1	77.49	77.49
2024-03	3	319.98	106.66
2024-04	2	262.98	131.49

Inventory Summary by Category

SELECT c.name AS category,
       COUNT(p.id) AS products,
       SUM(p.stock_quantity) AS total_stock,
       ROUND(SUM(p.price * p.stock_quantity), 2) AS inventory_value
FROM categories c
JOIN products p ON c.id = p.category_id
GROUP BY c.name
ORDER BY inventory_value DESC;

Output:

category	products	total_stock	inventory_value
Electronics	4	425	20247.75
Sports	3	340	14898.20
Home & Kitchen	1	25	3249.75
Books	2	80	3024.50

Price Distribution Analysis

SELECT CASE
           WHEN price < 30 THEN 'Under $30'
           WHEN price < 50 THEN '$30 - $49.99'
           WHEN price < 100 THEN '$50 - $99.99'
           ELSE '$100 and above'
       END AS price_range,
       COUNT(*) AS product_count,
       ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY CASE
             WHEN price < 30 THEN 'Under $30'
             WHEN price < 50 THEN '$30 - $49.99'
             WHEN price < 100 THEN '$50 - $99.99'
             ELSE '$100 and above'
         END
ORDER BY avg_price;

Output:

price_range	product_count	avg_price
Under $30	2	27.49
$30 - $49.99	4	40.12
$50 - $99.99	2	77.50
$100 and above	2	120.00

Grouping by Expressions

Notice that the GROUP BY clause contains the same CASE expression as the SELECT list. When you group by an expression, you must repeat the full expression in GROUP BY. Some databases (MySQL, PostgreSQL) allow you to reference the alias instead, but repeating the expression is the most portable approach.

-- PostgreSQL / MySQL: Can use alias in GROUP BY
GROUP BY price_range

-- ANSI standard / SQL Server: Must repeat the expression
GROUP BY CASE WHEN price < 30 THEN 'Under $30' ... END

Order Activity by Day of Week

-- PostgreSQL syntax
SELECT TO_CHAR(order_date, 'Day') AS day_of_week,
       COUNT(*) AS order_count,
       SUM(total_amount) AS revenue
FROM orders
GROUP BY TO_CHAR(order_date, 'Day'), EXTRACT(DOW FROM order_date)
ORDER BY EXTRACT(DOW FROM order_date);

This kind of query helps businesses understand their busiest days.

Common GROUP BY Mistakes

Mistake 1: Forgetting GROUP BY with Aggregates

-- Wrong: Aggregate mixed with non-aggregated column, no GROUP BY
SELECT category_id, COUNT(*)
FROM products;
-- ERROR: column "products.category_id" must appear in the GROUP BY clause

Fix:

SELECT category_id, COUNT(*)
FROM products
GROUP BY category_id;

Mistake 2: Using WHERE Instead of HAVING to Filter Groups

-- Wrong: WHERE cannot filter on aggregate results
SELECT category_id, COUNT(*) AS product_count
FROM products
WHERE COUNT(*) > 2
GROUP BY category_id;
-- ERROR: aggregate functions are not allowed in WHERE

WHERE filters individual rows before grouping. To filter after grouping based on aggregate results, you need HAVING (covered in the next guide). For now, here is a preview:

-- Correct: HAVING filters groups after aggregation
SELECT category_id, COUNT(*) AS product_count
FROM products
GROUP BY category_id
HAVING COUNT(*) > 2;

Mistake 3: Selecting Too Many Columns

-- Wrong: price, name, and stock_quantity are not in GROUP BY
SELECT category_id, name, price, stock_quantity, COUNT(*)
FROM products
GROUP BY category_id;
-- ERROR

Fix: Only select what you can aggregate or group:

SELECT category_id,
       COUNT(*) AS product_count,
       MIN(price) AS cheapest,
       MAX(price) AS most_expensive,
       SUM(stock_quantity) AS total_stock
FROM products
GROUP BY category_id;

Output:

category_id	product_count	cheapest	most_expensive	total_stock
1	4	29.99	89.99	425
2	2	34.99	42.50	80
3	1	129.99	129.99	25
4	3	24.99	110.00	340

Mistake 4: Grouping Defeats the Purpose

-- Technically valid, but pointless: every row is its own group
SELECT id, name, price, COUNT(*) AS count
FROM products
GROUP BY id, name, price;

If you group by the primary key (or all columns), every row becomes its own group and COUNT(*) is always 1. The grouping adds nothing. Ask yourself: "Am I actually summarizing anything?"

Practical Exercises

Exercise 1

Count the number of customers in each city (exclude customers with unknown cities).

SELECT city, COUNT(*) AS customer_count
FROM customers
WHERE city IS NOT NULL
GROUP BY city
ORDER BY customer_count DESC;

Expected output:

city	customer_count
New York	2
Chicago	2
Los Angeles	1
Seattle	1

Exercise 2

Find the total revenue and average order value for each order status.

SELECT status,
       SUM(total_amount) AS total_revenue,
       ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
GROUP BY status
ORDER BY total_revenue DESC;

Expected output:

status	total_revenue	avg_order_value
pending	297.96999999999997	99.32
completed	287.46	95.82
shipped	284.99	142.50

Exercise 3

Find the cheapest and most expensive product in each category, along with the category name.

SELECT c.name AS category,
       MIN(p.price) AS cheapest,
       MAX(p.price) AS most_expensive,
       MAX(p.price) - MIN(p.price) AS price_range
FROM products p
JOIN categories c ON p.category_id = c.id
GROUP BY c.name
ORDER BY price_range DESC;

Expected output:

category	cheapest	most_expensive	price_range
Sports	24.99	110.00	85.01
Electronics	29.99	89.99	60.00
Books	34.99	42.50	7.509999999999998
Home & Kitchen	129.99	129.99	0.00

Exercise 4

Count how many orders each customer has placed, showing the customer name. Include customers with no orders.

SELECT c.first_name || ' ' || c.last_name AS customer,
       COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.first_name, c.last_name
ORDER BY order_count DESC, customer;

Expected output:

customer	order_count
Alice Johnson	2
Bob Martinez	2
Carol Singh	1
David Chen	1
Eva Brown	1
Frank Wilson	1
Grace Taylor	0

Exercise 5

Find the total quantity of items ordered for each product, and show only products that have been ordered.

SELECT p.name AS product,
       SUM(oi.quantity) AS total_ordered,
       SUM(oi.quantity * oi.unit_price) AS total_revenue
FROM order_items oi
JOIN products p ON oi.product_id = p.id
GROUP BY p.id, p.name
ORDER BY total_revenue DESC;

Expected output:

product	total_ordered	total_revenue
Running Shoes X1	2	220.00
Mechanical Keyboard	2	179.98
Coffee Maker Pro	1	129.99
USB-C Hub	2	90.00
SQL for Beginners	2	69.98
Wireless Mouse	2	59.98
Stainless Water Bottle	2	49.98
Yoga Mat Premium	1	38.00

Key Takeaways

GROUP BY transforms aggregate functions from whole-table summaries into per-group summaries, enabling the analytical queries that power reports, dashboards, and business intelligence. Here is what you should remember:

GROUP BY divides rows into groups based on one or more columns, then applies aggregate functions to each group separately
Each unique value (or combination of values) in the GROUP BY columns creates one group and produces one output row
Every column in your SELECT list must either appear in GROUP BY or be inside an aggregate function. No exceptions in standard SQL.
GROUP BY with multiple columns creates groups based on each unique combination of values
WHERE filters rows before grouping. Use it to exclude individual rows
HAVING (next guide) filters groups after aggregation. Use it to exclude groups based on aggregate results
The processing order is: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
NULL values form their own group in GROUP BY
Combine GROUP BY with JOIN to display meaningful names instead of raw IDs
Combine with ORDER BY to rank groups by their aggregate values
Combine with LIMIT to get "top N" groups
MySQL may silently allow non-aggregated columns without GROUP BY, returning arbitrary values. Enable ONLY_FULL_GROUP_BY mode to prevent this

GROUP BY is where SQL transitions from simple data retrieval into genuine data analysis. Combined with aggregate functions, it lets you answer complex business questions in a single query. The next guide covers HAVING, which adds the ability to filter groups after aggregation, completing your analytical toolkit.

How GROUP BY Works Conceptually​

The Mental Model​

Grouping by One Column​

Products Per Category​

Orders Per Status​

Revenue Per Status​

Orders Per Customer​

Customers Per City​

What Happens to NULL in GROUP BY?​

Grouping by Multiple Columns​

Orders Per Customer Per Status​

Products Per Category and Availability​

Orders Per Month and Status​

Adding Context with JOINs​

Product Counts with Category Names​

Customer Order Summary with Names​

The Most Common GROUP BY Error​

The Error​

Why It Fails​

The Rule​

How to Fix It​

Multiple Wrong Examples and Their Fixes​

GROUP BY with WHERE​

Revenue from Completed Orders Per Customer​

The Processing Order​

GROUP BY with ORDER BY​

Categories Ranked by Average Price​

Customers Ranked by Total Spending​

Top 3 Categories by Product Count​

Real-World GROUP BY Scenarios​

Sales Report by Month​

Inventory Summary by Category​

Price Distribution Analysis​

Order Activity by Day of Week​

Common GROUP BY Mistakes​

Mistake 1: Forgetting GROUP BY with Aggregates​

Mistake 2: Using WHERE Instead of HAVING to Filter Groups​

Mistake 3: Selecting Too Many Columns​

Mistake 4: Grouping Defeats the Purpose​

Practical Exercises​

Exercise 1​

Exercise 2​

Exercise 3​

Exercise 4​

Exercise 5​

Key Takeaways​

Table of Contents

How GROUP BY Works Conceptually

The Mental Model

Grouping by One Column

Products Per Category

Orders Per Status

Revenue Per Status

Orders Per Customer

Customers Per City

What Happens to NULL in GROUP BY?

Grouping by Multiple Columns

Orders Per Customer Per Status

Products Per Category and Availability

Orders Per Month and Status

Adding Context with JOINs

Product Counts with Category Names

Customer Order Summary with Names

The Most Common GROUP BY Error

The Error

Why It Fails

The Rule

How to Fix It

Multiple Wrong Examples and Their Fixes

GROUP BY with WHERE

Revenue from Completed Orders Per Customer

The Processing Order

GROUP BY with ORDER BY

Categories Ranked by Average Price

Customers Ranked by Total Spending

Top 3 Categories by Product Count

Real-World GROUP BY Scenarios

Sales Report by Month

Inventory Summary by Category

Price Distribution Analysis

Order Activity by Day of Week

Common GROUP BY Mistakes

Mistake 1: Forgetting GROUP BY with Aggregates

Mistake 2: Using WHERE Instead of HAVING to Filter Groups

Mistake 3: Selecting Too Many Columns

Mistake 4: Grouping Defeats the Purpose

Practical Exercises

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Key Takeaways