SQL GROUP BY for Grouping Results
In the previous guide, you learned how aggregate functions condense an entire table into a single summary row. COUNT(*) tells you how many total products you have. AVG(price) gives you one average across all products. But what if you need the average price per category? Or the number of orders per customer? Or the total revenue per month?
These questions require the database to split the data into groups first, then aggregate each group separately. This is exactly what GROUP BY does. It is one of the most powerful clauses in SQL and the foundation of every meaningful report, dashboard, and analytical query.
This guide covers how GROUP BY works conceptually, how to group by one or multiple columns, the critical error that every beginner encounters (selecting non-aggregated columns), and practical real-world examples. Everything uses the ShopSmart sample database with full outputs (we defined it in a previous guide here).
How GROUP BY Works Conceptually
Without GROUP BY, aggregate functions process all matching rows as one big group and return a single result:
-- One group, one result: the average price across ALL products
SELECT AVG(price) AS avg_price
FROM products;
| avg_price |
|---|
| 61.045 |
With GROUP BY, the database first divides the rows into groups based on the values in one or more columns, then applies the aggregate function to each group separately, returning one result row per group:
-- Multiple groups, one result per group: average price PER category
SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;
| category_id | avg_price |
|---|---|
| 1 | 57.495 |
| 2 | 38.745000000000005 |
| 3 | 129.99 |
| 4 | 57.663333333333334 |
Instead of one average for all products, you now have a separate average for each category.
The Mental Model
Think of GROUP BY as a three-step process:
Step 1: Sort rows into groups based on the GROUP BY column:
| Group (category_id = 1) | Group (category_id = 2) | Group (category_id = 3) | Group (category_id = 4) |
|---|---|---|---|
| Wireless Mouse ($29.99) | SQL for Beginners ($34.99) | Coffee Maker Pro ($129.99) | Yoga Mat Premium ($38.00) |
| Mechanical Keyboard ($89.99) | Data Science Handbook ($42.50) | Running Shoes X1 ($110.00) | |
| USB-C Hub ($45.00) | Stainless Water Bottle ($24.99) | ||
| Bluetooth Speaker ($65.00) |
Step 2: Apply the aggregate function to each group independently:
- Category 1: AVG(29.99, 89.99, 45.00, 65.00) = 57.50
- Category 2: AVG(34.99, 42.50) = 38.75
- Category 3: AVG(129.99) = 129.99
- Category 4: AVG(38.00, 110.00, 24.99) = 57.66
Step 3: Return one row per group with the group identifier and the aggregated value.
GROUP BY does not change which rows are included. It changes how the rows are organized before aggregation. Without it, all rows form one group. With it, rows are split into multiple groups, and each group gets its own summary.
Grouping by One Column
The most common pattern is grouping by a single column to get summaries per category, per status, per customer, or per any other dimension.
Products Per Category
SELECT category_id, COUNT(*) AS product_count
FROM products
GROUP BY category_id
ORDER BY product_count DESC;
Output:
| category_id | product_count |
|---|---|
| 1 | 4 |
| 4 | 3 |
| 2 | 2 |
| 3 | 1 |
Electronics (category 1) has the most products. Home & Kitchen (category 3) has only one.
Orders Per Status
SELECT status, COUNT(*) AS order_count
FROM orders
GROUP BY status
ORDER BY order_count DESC;
Output:
| status | order_count |
|---|---|
| completed | 3 |
| pending | 3 |
| shipped | 2 |
Revenue Per Status
SELECT status,
COUNT(*) AS order_count,
SUM(total_amount) AS total_revenue,
ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
GROUP BY status
ORDER BY total_revenue DESC;
Output:
| status | order_count | total_revenue | avg_order_value |
|---|---|---|---|
| pending | 3 | 297.96999999999997 | 99.32 |
| completed | 3 | 287.46 | 95.82 |
| shipped | 2 | 284.99 | 142.50 |
Each status group gets its own count, total, and average. This single query could power an order management dashboard.
Orders Per Customer
SELECT customer_id,
COUNT(*) AS order_count,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC;
Output:
| customer_id | order_count | total_spent |
|---|---|---|
| 6 | 1 | 199.98 |
| 1 | 2 | 197.47 |
| 4 | 1 | 155.00 |
| 2 | 2 | 152.99 |
| 3 | 1 | 129.99 |
| 5 | 1 | 34.99 |
Customer 1 (Alice) has spent the most overall with 2 orders totaling $197.47. Customer 6 (Frank) has the highest single order at $199.98.
Customers Per City
SELECT city, COUNT(*) AS customer_count
FROM customers
WHERE city IS NOT NULL
GROUP BY city
ORDER BY customer_count DESC;
Output:
| city | customer_count |
|---|---|
| New York | 2 |
| Chicago | 2 |
| Los Angeles | 1 |
| Seattle | 1 |
Notice the WHERE city IS NOT NULL filter. Without it, NULL would form its own group. Whether that is desirable depends on your use case.
What Happens to NULL in GROUP BY?
Unlike most SQL operations where NULL creates confusion, GROUP BY treats all NULL values as belonging to the same group:
-- NULL forms its own group
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city
ORDER BY customer_count DESC;
Output:
| city | customer_count |
|---|---|
| New York | 2 |
| Chicago | 2 |
| Los Angeles | 1 |
| Seattle | 1 |
| NULL | 1 |
Grace's NULL city creates a group of its own. This is consistent with how DISTINCT treats NULL values.
Grouping by Multiple Columns
You can group by two or more columns to create more specific groupings. The database creates a separate group for each unique combination of values across all grouped columns.
Orders Per Customer Per Status
SELECT customer_id, status, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id, status
ORDER BY customer_id, status;
Output:
| customer_id | status | order_count |
|---|---|---|
| 1 | completed | 2 |
| 2 | completed | 1 |
| 2 | pending | 1 |
| 3 | shipped | 1 |
| 4 | shipped | 1 |
| 5 | pending | 1 |
| 6 | pending | 1 |
Customer 2 (Bob) appears in two rows because he has orders in two different statuses. Each unique combination of customer_id and status gets its own group.
Compare this with the single-column grouping:
-- Single-column: 6 groups (one per customer)
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
-- Returns 6 rows
-- Multi-column: 7 groups (one per customer-status combination)
SELECT customer_id, status, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id, status;
-- Returns 7 rows (Bob has 2 rows now)
Products Per Category and Availability
SELECT category_id,
is_available,
COUNT(*) AS product_count,
ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category_id, is_available
ORDER BY category_id, is_available DESC;
Output:
| category_id | is_available | product_count | avg_price |
|---|---|---|---|
| 1 | true | 3 | 54.99 |
| 1 | false | 1 | 65.00 |
| 2 | true | 2 | 38.75 |
| 3 | true | 1 | 129.99 |
| 4 | true | 3 | 57.66 |
Category 1 (Electronics) splits into two groups: 3 available products averaging $54.99 and 1 unavailable product at $65.00 (the Bluetooth Speaker).
Orders Per Month and Status
-- Extract year-month from order dates for monthly grouping
-- PostgreSQL syntax
SELECT TO_CHAR(order_date, 'YYYY-MM') AS order_month,
status,
COUNT(*) AS order_count,
SUM(total_amount) AS revenue
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM'), status
ORDER BY order_month, status;
Output:
| order_month | status | order_count | revenue |
|---|---|---|---|
| 2024-01 | completed | 2 | 209.97 |
| 2024-02 | completed | 1 | 77.49 |
| 2024-03 | pending | 1 | 34.99 |
| 2024-03 | shipped | 2 | 284.99 |
| 2024-04 | pending | 2 | 262.98 |
The function to extract a formatted date string differs across databases:
-- PostgreSQL
TO_CHAR(order_date, 'YYYY-MM')
-- MySQL
DATE_FORMAT(order_date, '%Y-%m')
-- SQLite
STRFTIME('%Y-%m', order_date)
-- SQL Server
FORMAT(order_date, 'yyyy-MM')
The concept is the same in every database. Only the function name and format string differ.
Adding Context with JOINs
Grouping by category_id or customer_id produces correct numbers, but raw IDs are not very readable. Joining with related tables lets you display names alongside the aggregated data.
Product Counts with Category Names
SELECT c.name AS category,
COUNT(p.id) AS product_count,
ROUND(AVG(p.price), 2) AS avg_price,
MIN(p.price) AS cheapest,
MAX(p.price) AS most_expensive
FROM categories c
LEFT JOIN products p ON c.id = p.category_id
GROUP BY c.name
ORDER BY product_count DESC;
Output:
| category | product_count | avg_price | cheapest | most_expensive |
|---|---|---|---|---|
| Electronics | 4 | 57.50 | 29.99 | 89.99 |
| Sports | 3 | 57.66 | 24.99 | 110.00 |
| Books | 2 | 38.75 | 34.99 | 42.50 |
| Home & Kitchen | 1 | 129.99 | 129.99 | 129.99 |
Now the report shows "Electronics" instead of "1", which is far more useful for anyone reading it.
Customer Order Summary with Names
SELECT c.first_name || ' ' || c.last_name AS customer,
c.city,
COUNT(o.id) AS order_count,
COALESCE(SUM(o.total_amount), 0) AS total_spent,
COALESCE(ROUND(AVG(o.total_amount), 2), 0) AS avg_order_value
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.first_name, c.last_name, c.city
ORDER BY total_spent DESC;
Output:
| customer | city | order_count | total_spent | avg_order_value |
|---|---|---|---|---|
| Frank Wilson | Chicago | 1 | 199.98 | 199.98 |
| Alice Johnson | New York | 2 | 197.47 | 98.73 |
| David Chen | New York | 1 | 155.00 | 155.00 |
| Bob Martinez | Los Angeles | 2 | 152.99 | 76.50 |
| Carol Singh | Chicago | 1 | 129.99 | 129.99 |
| Eva Brown | Seattle | 1 | 34.99 | 34.99 |
| Grace Taylor | NULL | 0 | 0 | 0 |
The LEFT JOIN ensures Grace appears even though she has no orders. COALESCE converts her NULL totals to 0.
When grouping by customer, include the primary key (c.id) in the GROUP BY clause. This guarantees each customer gets their own group, even if two customers share the same name. The additional columns (first_name, last_name, city) are functionally dependent on c.id, so including them is safe and necessary for display.
The Most Common GROUP BY Error
This is the single most important section of this guide. Almost every SQL beginner encounters this error, and understanding why it happens will permanently improve your SQL writing.
The Error
-- This query FAILS in most databases
SELECT name, category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;
Error message (PostgreSQL):
ERROR: column "products.name" must appear in the GROUP BY clause
or be used in an aggregate function
Error message (SQL Server):
Column 'products.name' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause.
Why It Fails
Let us trace through the logic. When you write GROUP BY category_id, the database creates one group per category:
| Group: category_id = 1 | Group: category_id = 2 |
|---|---|
| Wireless Mouse | SQL for Beginners |
| Mechanical Keyboard | Data Science Handbook |
| USB-C Hub | |
| Bluetooth Speaker |
For category 1, AVG(price) produces one number: 57.50. But name has four different values in this group. Which name should the database display? "Wireless Mouse"? "Mechanical Keyboard"? There is no logical answer. The database refuses to guess and raises an error instead.
The Rule
Every column in your SELECT list must either:
- Appear in the
GROUP BYclause, or - Be used inside an aggregate function (
COUNT,SUM,AVG,MIN,MAX)
No exceptions (with one caveat for MySQL, discussed below).
How to Fix It
There are several valid approaches depending on what you actually need:
Fix 1: Remove the non-aggregated column
-- If you just want the average price per category, you don't need the name
SELECT category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;
Fix 2: Add the column to GROUP BY
-- This creates a group for each unique (category_id, name) combination
-- Probably NOT what you want, as each product becomes its own group
SELECT name, category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id, name;
Each product is now its own group, and the "average" is just each product's own price. This is technically valid but defeats the purpose of aggregation.
Fix 3: Aggregate the column
-- Use MIN or MAX on the name column if you want a representative name
SELECT MIN(name) AS first_product_name,
category_id,
ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category_id;
Fix 4: Use a JOIN or subquery to bring in non-grouped data
This is the most robust approach when you need detailed information alongside grouped summaries, but it requires more advanced techniques covered in future guides.
Multiple Wrong Examples and Their Fixes
-- Wrong: 'email' is not grouped or aggregated
SELECT city, email, COUNT(*) AS customer_count
FROM customers
GROUP BY city;
-- ERROR
-- Fix: Remove email or aggregate it
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city;
-- Wrong: 'order_date' is not grouped or aggregated
SELECT customer_id, order_date, SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id;
-- ERROR
-- Fix option 1: Add order_date to GROUP BY (groups by customer AND date)
SELECT customer_id, order_date, SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id, order_date;
-- Fix option 2: Aggregate order_date
SELECT customer_id,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id;
MySQL with the default sql_mode setting may not raise an error when you select non-aggregated columns without grouping them. Instead, it silently returns an arbitrary value from the group.
-- MySQL (default settings): No error, but WRONG results
SELECT name, category_id, AVG(price) AS avg_price
FROM products
GROUP BY category_id;
-- Returns one row per category, but the 'name' value is random/unpredictable
This is not a feature; it is a trap. The name value you see has no logical relationship to the average. Enable ONLY_FULL_GROUP_BY mode in MySQL to enforce the standard behavior:
SET sql_mode = 'ONLY_FULL_GROUP_BY';
With this mode enabled, MySQL behaves like PostgreSQL and SQL Server, rejecting queries with non-aggregated columns not in GROUP BY.
GROUP BY with WHERE
WHERE filters rows before grouping. The database removes rows that do not match the WHERE condition, then groups the remaining rows.
-- Count available products per category
SELECT category_id, COUNT(*) AS available_count
FROM products
WHERE is_available = true
GROUP BY category_id
ORDER BY available_count DESC;
Output:
| category_id | available_count |
|---|---|
| 4 | 3 |
| 1 | 3 |
| 2 | 2 |
| 3 | 1 |
Compare this with the unfiltered count:
-- Count ALL products per category (including unavailable)
SELECT category_id, COUNT(*) AS total_count
FROM products
GROUP BY category_id
ORDER BY total_count DESC;
| category_id | total_count |
|---|---|
| 1 | 4 |
| 4 | 3 |
| 2 | 2 |
| 3 | 1 |
Category 1 drops from 4 to 3 because the Bluetooth Speaker (unavailable) is excluded by WHERE.
Revenue from Completed Orders Per Customer
SELECT customer_id,
COUNT(*) AS completed_orders,
SUM(total_amount) AS completed_revenue
FROM orders
WHERE status = 'completed'
GROUP BY customer_id
ORDER BY completed_revenue DESC;
Output:
| customer_id | completed_orders | completed_revenue |
|---|---|---|
| 1 | 2 | 197.47 |
| 2 | 1 | 89.99 |
Only Alice and Bob have completed orders.
The Processing Order
Understanding the order in which SQL processes clauses is critical when using GROUP BY:
| Step | Clause | What Happens |
|---|---|---|
| 1 | FROM | Identify the source table(s) |
| 2 | WHERE | Filter individual rows |
| 3 | GROUP BY | Divide remaining rows into groups |
| 4 | HAVING | Filter groups (covered in the next guide) |
| 5 | SELECT | Calculate expressions, apply aliases |
| 6 | ORDER BY | Sort the final results |
| 7 | LIMIT | Restrict the number of rows returned |
This order explains several important behaviors:
WHEREcannot use aggregate functions (aggregation has not happened yet at step 2)WHEREcannot use column aliases (aliases are not created until step 5)ORDER BYcan use aliases (it runs afterSELECT)GROUP BYcannot use column aliases in most databases (it runs beforeSELECT)
GROUP BY with ORDER BY
You can sort grouped results just like any other query. Sorting by the aggregate result is particularly useful:
Categories Ranked by Average Price
SELECT category_id,
ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category_id
ORDER BY avg_price DESC;
Output:
| category_id | avg_price |
|---|---|
| 3 | 129.99 |
| 4 | 57.66 |
| 1 | 57.49 |
| 2 | 38.75 |
Customers Ranked by Total Spending
SELECT customer_id,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC;
Output:
| customer_id | total_spent |
|---|---|
| 6 | 199.98 |
| 1 | 197.47 |
| 4 | 155.00 |
| 2 | 152.99 |
| 3 | 129.99 |
| 5 | 34.99 |
Top 3 Categories by Product Count
Combining GROUP BY, ORDER BY, and LIMIT:
SELECT category_id,
COUNT(*) AS product_count
FROM products
GROUP BY category_id
ORDER BY product_count DESC
LIMIT 3;
Output:
| category_id | product_count |
|---|---|
| 1 | 4 |
| 4 | 3 |
| 2 | 2 |
Real-World GROUP BY Scenarios
Sales Report by Month
-- PostgreSQL syntax
SELECT TO_CHAR(order_date, 'YYYY-MM') AS month,
COUNT(*) AS orders,
SUM(total_amount) AS revenue,
ROUND(AVG(total_amount), 2) AS avg_order
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM')
ORDER BY month;
Output:
| month | orders | revenue | avg_order |
|---|---|---|---|
| 2024-01 | 2 | 209.97 | 104.99 |
| 2024-02 | 1 | 77.49 | 77.49 |
| 2024-03 | 3 | 319.98 | 106.66 |
| 2024-04 | 2 | 262.98 | 131.49 |
Inventory Summary by Category
SELECT c.name AS category,
COUNT(p.id) AS products,
SUM(p.stock_quantity) AS total_stock,
ROUND(SUM(p.price * p.stock_quantity), 2) AS inventory_value
FROM categories c
JOIN products p ON c.id = p.category_id
GROUP BY c.name
ORDER BY inventory_value DESC;
Output:
| category | products | total_stock | inventory_value |
|---|---|---|---|
| Electronics | 4 | 425 | 20247.75 |
| Sports | 3 | 340 | 14898.20 |
| Home & Kitchen | 1 | 25 | 3249.75 |
| Books | 2 | 80 | 3024.50 |
Price Distribution Analysis
SELECT CASE
WHEN price < 30 THEN 'Under $30'
WHEN price < 50 THEN '$30 - $49.99'
WHEN price < 100 THEN '$50 - $99.99'
ELSE '$100 and above'
END AS price_range,
COUNT(*) AS product_count,
ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY CASE
WHEN price < 30 THEN 'Under $30'
WHEN price < 50 THEN '$30 - $49.99'
WHEN price < 100 THEN '$50 - $99.99'
ELSE '$100 and above'
END
ORDER BY avg_price;
Output:
| price_range | product_count | avg_price |
|---|---|---|
| Under $30 | 2 | 27.49 |
| $30 - $49.99 | 4 | 40.12 |
| $50 - $99.99 | 2 | 77.50 |
| $100 and above | 2 | 120.00 |
Notice that the GROUP BY clause contains the same CASE expression as the SELECT list. When you group by an expression, you must repeat the full expression in GROUP BY. Some databases (MySQL, PostgreSQL) allow you to reference the alias instead, but repeating the expression is the most portable approach.
-- PostgreSQL / MySQL: Can use alias in GROUP BY
GROUP BY price_range
-- ANSI standard / SQL Server: Must repeat the expression
GROUP BY CASE WHEN price < 30 THEN 'Under $30' ... END
Order Activity by Day of Week
-- PostgreSQL syntax
SELECT TO_CHAR(order_date, 'Day') AS day_of_week,
COUNT(*) AS order_count,
SUM(total_amount) AS revenue
FROM orders
GROUP BY TO_CHAR(order_date, 'Day'), EXTRACT(DOW FROM order_date)
ORDER BY EXTRACT(DOW FROM order_date);
This kind of query helps businesses understand their busiest days.
Common GROUP BY Mistakes
Mistake 1: Forgetting GROUP BY with Aggregates
-- Wrong: Aggregate mixed with non-aggregated column, no GROUP BY
SELECT category_id, COUNT(*)
FROM products;
-- ERROR: column "products.category_id" must appear in the GROUP BY clause
Fix:
SELECT category_id, COUNT(*)
FROM products
GROUP BY category_id;
Mistake 2: Using WHERE Instead of HAVING to Filter Groups
-- Wrong: WHERE cannot filter on aggregate results
SELECT category_id, COUNT(*) AS product_count
FROM products
WHERE COUNT(*) > 2
GROUP BY category_id;
-- ERROR: aggregate functions are not allowed in WHERE
WHERE filters individual rows before grouping. To filter after grouping based on aggregate results, you need HAVING (covered in the next guide). For now, here is a preview:
-- Correct: HAVING filters groups after aggregation
SELECT category_id, COUNT(*) AS product_count
FROM products
GROUP BY category_id
HAVING COUNT(*) > 2;
Mistake 3: Selecting Too Many Columns
-- Wrong: price, name, and stock_quantity are not in GROUP BY
SELECT category_id, name, price, stock_quantity, COUNT(*)
FROM products
GROUP BY category_id;
-- ERROR
Fix: Only select what you can aggregate or group:
SELECT category_id,
COUNT(*) AS product_count,
MIN(price) AS cheapest,
MAX(price) AS most_expensive,
SUM(stock_quantity) AS total_stock
FROM products
GROUP BY category_id;
Output:
| category_id | product_count | cheapest | most_expensive | total_stock |
|---|---|---|---|---|
| 1 | 4 | 29.99 | 89.99 | 425 |
| 2 | 2 | 34.99 | 42.50 | 80 |
| 3 | 1 | 129.99 | 129.99 | 25 |
| 4 | 3 | 24.99 | 110.00 | 340 |
Mistake 4: Grouping Defeats the Purpose
-- Technically valid, but pointless: every row is its own group
SELECT id, name, price, COUNT(*) AS count
FROM products
GROUP BY id, name, price;
If you group by the primary key (or all columns), every row becomes its own group and COUNT(*) is always 1. The grouping adds nothing. Ask yourself: "Am I actually summarizing anything?"
Practical Exercises
Exercise 1
Count the number of customers in each city (exclude customers with unknown cities).
SELECT city, COUNT(*) AS customer_count
FROM customers
WHERE city IS NOT NULL
GROUP BY city
ORDER BY customer_count DESC;
Expected output:
| city | customer_count |
|---|---|
| New York | 2 |
| Chicago | 2 |
| Los Angeles | 1 |
| Seattle | 1 |
Exercise 2
Find the total revenue and average order value for each order status.
SELECT status,
SUM(total_amount) AS total_revenue,
ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
GROUP BY status
ORDER BY total_revenue DESC;
Expected output:
| status | total_revenue | avg_order_value |
|---|---|---|
| pending | 297.96999999999997 | 99.32 |
| completed | 287.46 | 95.82 |
| shipped | 284.99 | 142.50 |
Exercise 3
Find the cheapest and most expensive product in each category, along with the category name.
SELECT c.name AS category,
MIN(p.price) AS cheapest,
MAX(p.price) AS most_expensive,
MAX(p.price) - MIN(p.price) AS price_range
FROM products p
JOIN categories c ON p.category_id = c.id
GROUP BY c.name
ORDER BY price_range DESC;
Expected output:
| category | cheapest | most_expensive | price_range |
|---|---|---|---|
| Sports | 24.99 | 110.00 | 85.01 |
| Electronics | 29.99 | 89.99 | 60.00 |
| Books | 34.99 | 42.50 | 7.509999999999998 |
| Home & Kitchen | 129.99 | 129.99 | 0.00 |
Exercise 4
Count how many orders each customer has placed, showing the customer name. Include customers with no orders.
SELECT c.first_name || ' ' || c.last_name AS customer,
COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.first_name, c.last_name
ORDER BY order_count DESC, customer;
Expected output:
| customer | order_count |
|---|---|
| Alice Johnson | 2 |
| Bob Martinez | 2 |
| Carol Singh | 1 |
| David Chen | 1 |
| Eva Brown | 1 |
| Frank Wilson | 1 |
| Grace Taylor | 0 |
Exercise 5
Find the total quantity of items ordered for each product, and show only products that have been ordered.
SELECT p.name AS product,
SUM(oi.quantity) AS total_ordered,
SUM(oi.quantity * oi.unit_price) AS total_revenue
FROM order_items oi
JOIN products p ON oi.product_id = p.id
GROUP BY p.id, p.name
ORDER BY total_revenue DESC;
Expected output:
| product | total_ordered | total_revenue |
|---|---|---|
| Running Shoes X1 | 2 | 220.00 |
| Mechanical Keyboard | 2 | 179.98 |
| Coffee Maker Pro | 1 | 129.99 |
| USB-C Hub | 2 | 90.00 |
| SQL for Beginners | 2 | 69.98 |
| Wireless Mouse | 2 | 59.98 |
| Stainless Water Bottle | 2 | 49.98 |
| Yoga Mat Premium | 1 | 38.00 |
Key Takeaways
GROUP BY transforms aggregate functions from whole-table summaries into per-group summaries, enabling the analytical queries that power reports, dashboards, and business intelligence. Here is what you should remember:
GROUP BYdivides rows into groups based on one or more columns, then applies aggregate functions to each group separately- Each unique value (or combination of values) in the
GROUP BYcolumns creates one group and produces one output row - Every column in your
SELECTlist must either appear inGROUP BYor be inside an aggregate function. No exceptions in standard SQL. GROUP BYwith multiple columns creates groups based on each unique combination of valuesWHEREfilters rows before grouping. Use it to exclude individual rowsHAVING(next guide) filters groups after aggregation. Use it to exclude groups based on aggregate results- The processing order is:
FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT NULLvalues form their own group inGROUP BY- Combine
GROUP BYwithJOINto display meaningful names instead of raw IDs - Combine with
ORDER BYto rank groups by their aggregate values - Combine with
LIMITto get "top N" groups - MySQL may silently allow non-aggregated columns without
GROUP BY, returning arbitrary values. EnableONLY_FULL_GROUP_BYmode to prevent this
GROUP BY is where SQL transitions from simple data retrieval into genuine data analysis. Combined with aggregate functions, it lets you answer complex business questions in a single query. The next guide covers HAVING, which adds the ability to filter groups after aggregation, completing your analytical toolkit.