Subqueries in SELECT and FROM Clauses
Most developers first encounter subqueries inside a WHERE clause, filtering rows based on a computed value or a list. But subqueries are far more versatile than that. You can place a subquery directly in the SELECT list to create computed columns, or in the FROM clause to build temporary result sets that the outer query treats like regular tables.
Understanding how to write an SQL subquery in SELECT and FROM clauses unlocks a new level of flexibility. You can attach per-row calculations to your results without collapsing data with GROUP BY, build intermediate data layers that simplify complex logic, and create self-contained, readable queries that would otherwise require multiple steps or temporary tables.
This guide covers both placements in depth, with practical examples, full outputs, common mistakes, and clear guidance on when each approach is the right choice.
The Sample Schema
Every example in this guide uses the following schema:
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100)
);
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(100),
department_id INT,
salary DECIMAL(10,2),
hire_date DATE
);
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
price DECIMAL(10,2)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10,2)
);
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city VARCHAR(50)
);
-- departments
INSERT INTO departments VALUES (1, 'Engineering');
INSERT INTO departments VALUES (2, 'Marketing');
INSERT INTO departments VALUES (3, 'Sales');
-- employees
INSERT INTO employees VALUES (101, 'Alice', 1, 95000, '2020-03-15');
INSERT INTO employees VALUES (102, 'Bob', 1, 82000, '2021-07-01');
INSERT INTO employees VALUES (103, 'Carol', 1, 78000, '2022-01-10');
INSERT INTO employees VALUES (104, 'Dave', 2, 68000, '2019-11-20');
INSERT INTO employees VALUES (105, 'Eve', 2, 72000, '2021-05-05');
INSERT INTO employees VALUES (106, 'Frank', 3, 61000, '2023-02-14');
INSERT INTO employees VALUES (107, 'Grace', 3, 59000, '2022-08-30');
INSERT INTO employees VALUES (108, 'Hank', 3, 63000, '2020-06-01');
-- products
INSERT INTO products VALUES (201, 'Wireless Mouse', 'Electronics', 25.99);
INSERT INTO products VALUES (202, 'USB-C Cable', 'Electronics', 9.99);
INSERT INTO products VALUES (203, 'Notebook A5', 'Stationery', 4.50);
INSERT INTO products VALUES (204, 'Mechanical Keyboard', 'Electronics', 89.99);
INSERT INTO products VALUES (205, 'Desk Lamp', 'Furniture', 34.50);
INSERT INTO products VALUES (206, 'Pen Pack', 'Stationery', 2.99);
INSERT INTO products VALUES (207, 'Standing Desk', 'Furniture', 299.99);
-- customers
INSERT INTO customers VALUES (1, 'Acme Corp', 'New York');
INSERT INTO customers VALUES (2, 'Beta Inc', 'Chicago');
INSERT INTO customers VALUES (3, 'Gamma LLC', 'New York');
INSERT INTO customers VALUES (4, 'Delta Partners', 'Denver');
-- orders
INSERT INTO orders VALUES (1001, 1, '2024-01-15', 500.00);
INSERT INTO orders VALUES (1002, 1, '2024-03-22', 1200.00);
INSERT INTO orders VALUES (1003, 2, '2024-02-10', 300.00);
INSERT INTO orders VALUES (1004, 3, '2024-04-05', 850.00);
INSERT INTO orders VALUES (1005, 1, '2024-05-18', 2100.00);
INSERT INTO orders VALUES (1006, 3, '2024-06-01', 475.00);
Scalar Subqueries in the SELECT Clause
A scalar subquery is a subquery that returns exactly one value: one row, one column. When placed in the SELECT list, it becomes a computed column, calculated for every row in the result set.
Basic Syntax
SELECT
column1,
column2,
(SELECT single_value_expression FROM some_table WHERE condition) AS alias
FROM main_table;
The parentheses wrap the subquery, and the AS alias gives the computed column a name.
Example: Showing a Global Value Next to Every Row
The simplest use case is displaying a constant computed value alongside individual rows. Suppose you want to see each product's price next to the overall average price:
SELECT
product_name,
price,
(SELECT AVG(price) FROM products) AS avg_price
FROM products;
Output:
| product_name | price | avg_price |
|---|---|---|
| Wireless Mouse | 25.99 | 66.85 |
| USB-C Cable | 9.99 | 66.85 |
| Notebook A5 | 4.50 | 66.85 |
| Mechanical Keyboard | 89.99 | 66.85 |
| Desk Lamp | 34.50 | 66.85 |
| Pen Pack | 2.99 | 66.85 |
| Standing Desk | 299.99 | 66.85 |
The subquery SELECT AVG(price) FROM products is not correlated: it produces the same value for every row. The database is smart enough to compute it only once internally, even though it appears once per row in the output.
Example: Computing a Derived Column
You can use the scalar subquery result in arithmetic expressions:
SELECT
product_name,
price,
(SELECT AVG(price) FROM products) AS avg_price,
price - (SELECT AVG(price) FROM products) AS diff_from_avg,
ROUND(price / (SELECT AVG(price) FROM products) * 100, 1) AS pct_of_avg
FROM products
ORDER BY price DESC;
Output:
| product_name | price | avg_price | diff_from_avg | pct_of_avg |
|---|---|---|---|---|
| Standing Desk | 299.99 | 66.85 | 233.14 | 448.7 |
| Mechanical Keyboard | 89.99 | 66.85 | 23.14 | 134.6 |
| Desk Lamp | 34.50 | 66.85 | -32.35 | 51.6 |
| Wireless Mouse | 25.99 | 66.85 | -40.86 | 38.9 |
| USB-C Cable | 9.99 | 66.85 | -56.86 | 14.9 |
| Notebook A5 | 4.50 | 66.85 | -62.35 | 6.7 |
| Pen Pack | 2.99 | 66.85 | -63.86 | 4.5 |
Each row now shows how far above or below the average its price sits, both in absolute terms and as a percentage.
When you reference the same scalar subquery multiple times in one SELECT list, most modern database optimizers detect this and evaluate it only once. However, for clarity and maintainability, consider using a CTE if the same subquery appears more than twice.
Correlated Scalar Subqueries in SELECT
Scalar subqueries become truly powerful when correlated with the outer query. A correlated scalar subquery references a column from the outer row, so its result changes from row to row.
Show each employee alongside the number of colleagues in their department:
SELECT
e.employee_name,
e.department_id,
e.salary,
(
SELECT COUNT(*)
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS dept_headcount
FROM employees AS e
ORDER BY e.department_id;
Output:
| employee_name | department_id | salary | dept_headcount |
|---|---|---|---|
| Alice | 1 | 95000 | 3 |
| Bob | 1 | 82000 | 3 |
| Carol | 1 | 78000 | 3 |
| Dave | 2 | 68000 | 2 |
| Eve | 2 | 72000 | 2 |
| Frank | 3 | 61000 | 3 |
| Grace | 3 | 59000 | 3 |
| Hank | 3 | 63000 | 3 |
The subquery runs logically for each employee, computing the headcount for that employee's department. Engineering and Sales each have 3 people, Marketing has 2.
Show each employee's salary rank within their department:
SELECT
e.employee_name,
e.department_id,
e.salary,
(
SELECT COUNT(*)
FROM employees AS e2
WHERE e2.department_id = e.department_id
AND e2.salary > e.salary
) + 1 AS dept_salary_rank
FROM employees AS e
ORDER BY e.department_id, dept_salary_rank;
Output:
| employee_name | department_id | salary | dept_salary_rank |
|---|---|---|---|
| Alice | 1 | 95000 | 1 |
| Bob | 1 | 82000 | 2 |
| Carol | 1 | 78000 | 3 |
| Eve | 2 | 72000 | 1 |
| Dave | 2 | 68000 | 2 |
| Hank | 3 | 63000 | 1 |
| Frank | 3 | 61000 | 2 |
| Grace | 3 | 59000 | 3 |
The subquery counts how many employees in the same department have a higher salary, then adds 1 to produce a rank. This is a classic technique that predates window functions.
Modern SQL databases support window functions like RANK(), ROW_NUMBER(), and COUNT() OVER(...) that can replace many correlated scalar subqueries with cleaner and often faster syntax. However, understanding the subquery approach is essential because not every environment supports window functions, and the underlying concept translates directly.
The One-Row, One-Column Rule
A scalar subquery in the SELECT clause must return exactly one row and one column. Violating either constraint raises an error.
Wrong (multiple rows):
SELECT
e.employee_name,
(
SELECT e2.salary
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS some_salary
FROM employees AS e;
ERROR: more than one row returned by a subquery used as an expression
The subquery returns all salaries in the department, not a single value. Use an aggregate function to reduce it to one value:
Correct:
SELECT
e.employee_name,
(
SELECT MAX(e2.salary)
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS max_dept_salary
FROM employees AS e;
Wrong (multiple columns):
SELECT
e.employee_name,
(
SELECT AVG(e2.salary), COUNT(*) -- Two columns: ERROR
FROM employees AS e2
WHERE e2.department_id = e.department_id
)
FROM employees AS e;
ERROR: subquery must return only one column
If a scalar subquery returns multiple rows in SQLite, SQLite silently uses the first row of the result and ignores the rest.
Correct: Use separate subqueries for each value:
SELECT
e.employee_name,
(
SELECT AVG(e2.salary)
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS dept_avg_salary,
(
SELECT COUNT(*)
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS dept_headcount
FROM employees AS e;
If a scalar subquery returns zero rows (no matches found), the result is NULL, not an error. This is usually the desired behavior, but be aware of it when performing arithmetic with the result, since any operation involving NULL produces NULL.
-- If a department has no employees, this returns NULL, not 0
(SELECT AVG(e2.salary) FROM employees AS e2 WHERE e2.department_id = 999)
-- Use COALESCE to handle the NULL case
COALESCE(
(SELECT AVG(e2.salary) FROM employees AS e2 WHERE e2.department_id = 999),
0
) AS avg_salary
Subqueries in the FROM Clause (Derived Tables)
A subquery placed in the FROM clause creates a derived table (also called an inline view). The database executes the subquery first, materializes the result as a temporary result set, and then the outer query treats it like any other table: you can select from it, filter it, join it with other tables, and even join it with other derived tables.
Basic Syntax
SELECT dt.column1, dt.column2
FROM (
SELECT column1, column2
FROM some_table
WHERE condition
GROUP BY column1
) AS dt
WHERE dt.column2 > 10;
The key requirements:
- The subquery is wrapped in parentheses.
- It must have an alias (
AS dtin this case). Most databases require this. Without it, the query fails. - Every column in the subquery's
SELECTlist should have a name (useASfor computed columns).
Example: Pre-Aggregating Data
One of the most common uses of derived tables is to compute aggregates first, then filter or join the results.
Find departments where the average salary exceeds $65,000:
You might try this directly:
-- This works, but uses HAVING
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 65000;
Using a derived table achieves the same result with a different structure:
SELECT
dept_stats.department_id,
dept_stats.avg_salary
FROM (
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
) AS dept_stats
WHERE dept_stats.avg_salary > 65000;
Output (both versions):
| department_id | avg_salary |
|---|---|
| 1 | 85000.00 |
| 2 | 70000.00 |
In this simple case, HAVING is more concise. But derived tables shine when the filtering logic is more complex or when you need to join aggregated data back to detail rows.
Example: Joining Aggregated Data to Detail Rows
This is where derived tables become indispensable. Suppose you want to display each employee alongside their department's average salary and headcount, without using correlated subqueries:
SELECT
e.employee_name,
e.salary,
d.department_name,
dept_stats.avg_salary,
dept_stats.headcount,
e.salary - dept_stats.avg_salary AS diff_from_avg
FROM employees AS e
JOIN departments AS d
ON e.department_id = d.department_id
JOIN (
SELECT
department_id,
ROUND(AVG(salary), 2) AS avg_salary,
COUNT(*) AS headcount
FROM employees
GROUP BY department_id
) AS dept_stats
ON e.department_id = dept_stats.department_id
ORDER BY d.department_name, e.salary DESC;
Output:
| employee_name | salary | department_name | avg_salary | headcount | diff_from_avg |
|---|---|---|---|---|---|
| Alice | 95000 | Engineering | 85000.00 | 3 | 10000.00 |
| Bob | 82000 | Engineering | 85000.00 | 3 | -3000.00 |
| Carol | 78000 | Engineering | 85000.00 | 3 | -7000.00 |
| Eve | 72000 | Marketing | 70000.00 | 2 | 2000.00 |
| Dave | 68000 | Marketing | 70000.00 | 2 | -2000.00 |
| Hank | 63000 | Sales | 61000.00 | 3 | 2000.00 |
| Frank | 61000 | Sales | 61000.00 | 3 | 0.00 |
| Grace | 59000 | Sales | 61000.00 | 3 | -2000.00 |
The derived table dept_stats computes the average and headcount once per department, then the outer query joins this summary back to individual employees. This is both cleaner and more performant than using multiple correlated subqueries in the SELECT list.
Compare this derived table approach with the correlated subquery approach from the previous section. Both produce the same result, but the derived table computes aggregates once per group instead of once per row, which is significantly more efficient on large datasets.
Example: Filtering on Computed Values
Derived tables let you filter on computed columns that you cannot reference in a WHERE clause directly.
Find products whose price is more than double the average price of their category:
SELECT
p.product_name,
p.price,
p.category,
cat_avg.avg_category_price,
ROUND(p.price / cat_avg.avg_category_price, 2) AS price_ratio
FROM products AS p
JOIN (
SELECT
category,
AVG(price) AS avg_category_price
FROM products
GROUP BY category
) AS cat_avg
ON p.category = cat_avg.category
WHERE p.price > cat_avg.avg_category_price * 2
ORDER BY price_ratio DESC;
Output:
| product_name | price | category | avg_category_price | price_ratio |
|---|---|---|---|---|
| Mechanical Keyboard | 89.99 | Electronics | 41.99 | 2.14 |
The Electronics average is (25.99 + 9.99 + 89.99) / 3 = 41.99. The Mechanical Keyboard at 89.99 is 89.99 / 41.99 = 2.14x the average.
Only the Mechanical Keyboard exceeds twice its category average. This kind of analysis would be awkward to express without a derived table.
Multiple Derived Tables in One Query
You can use multiple derived tables in a single query, joining them together or with regular tables:
SELECT
cust_stats.customer_name,
cust_stats.total_orders,
cust_stats.total_spent,
overall.avg_spent_per_customer,
cust_stats.total_spent - overall.avg_spent_per_customer AS diff_from_avg
FROM (
SELECT
c.customer_name,
COUNT(o.order_id) AS total_orders,
COALESCE(SUM(o.total_amount), 0) AS total_spent
FROM customers AS c
LEFT JOIN orders AS o ON c.customer_id = o.customer_id
GROUP BY c.customer_name
) AS cust_stats
CROSS JOIN (
SELECT AVG(total_spent) AS avg_spent_per_customer
FROM (
SELECT
c.customer_id,
COALESCE(SUM(o.total_amount), 0) AS total_spent
FROM customers AS c
LEFT JOIN orders AS o ON c.customer_id = o.customer_id
GROUP BY c.customer_id
) AS per_customer
) AS overall
ORDER BY cust_stats.total_spent DESC;
Output:
| customer_name | total_orders | total_spent | avg_spent_per_customer | diff_from_avg |
|---|---|---|---|---|
| Acme Corp | 3 | 3800.00 | 1356.25 | 2443.75 |
| Gamma LLC | 2 | 1325.00 | 1356.25 | -31.25 |
| Beta Inc | 1 | 300.00 | 1356.25 | -1056.25 |
| Delta Partners | 0 | 0.00 | 1356.25 | -1356.25 |
This query uses three levels of derived tables: one for per-customer stats, one nested inside another for the overall average, and a CROSS JOIN to make the overall average available on every row.
When derived tables start nesting deeply, readability suffers. This is exactly the situation where Common Table Expressions (CTEs) shine. You can rewrite the above query with named CTEs to flatten the nesting and make each logical step independently readable.
Inline Views: Another Name for the Same Concept
You may encounter the term inline view in Oracle documentation and some SQL textbooks. An inline view is simply a subquery in the FROM clause. It is the same concept as a derived table under a different name.
| Term | Used By | Meaning |
|---|---|---|
| Derived table | SQL standard, MySQL, PostgreSQL, SQL Server | Subquery in FROM |
| Inline view | Oracle, academic texts | Subquery in FROM |
There is no behavioral difference. Both describe a subquery that the outer query treats as a virtual table.
-- Oracle documentation calls this an "inline view"
-- PostgreSQL/MySQL documentation calls it a "derived table"
-- They are identical
SELECT iv.department_id, iv.max_salary
FROM (
SELECT department_id, MAX(salary) AS max_salary
FROM employees
GROUP BY department_id
) iv; -- alias (Oracle allows omitting AS)
Derived Tables vs CTEs
Derived tables and CTEs solve many of the same problems. Here is a side-by-side comparison using the department stats example:
Derived table version:
SELECT
e.employee_name,
e.salary,
ds.avg_salary
FROM employees AS e
JOIN (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
) AS ds ON e.department_id = ds.department_id
WHERE e.salary > ds.avg_salary;
CTE version:
WITH dept_stats AS (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
)
SELECT
e.employee_name,
e.salary,
ds.avg_salary
FROM employees AS e
JOIN dept_stats AS ds ON e.department_id = ds.department_id
WHERE e.salary > ds.avg_salary;
Both produce the same output:
| employee_name | salary | avg_salary |
|---|---|---|
| Alice | 95000 | 85000.00 |
| Eve | 72000 | 70000.00 |
| Hank | 63000 | 61000.00 |
So when should you choose one over the other?
| Aspect | Derived Table | CTE |
|---|---|---|
| Defined | Inside FROM clause, inline | Before the main query, using WITH |
| Reusability within the query | Must be repeated if needed in multiple places | Can be referenced multiple times |
| Nesting | Can nest deeply, reducing readability | Flat structure, each CTE is named |
| Recursive queries | Not possible | Supported |
| Database support | Universal | Most modern databases (not very old MySQL versions) |
Rule of thumb: Use a derived table for simple, one-off transformations. Switch to a CTE when the subquery is complex, referenced more than once, or when you need multiple logical steps.
Combining Scalar Subqueries in SELECT with Derived Tables in FROM
These two techniques are not mutually exclusive. You can use both in the same query to build layered computations:
SELECT
dept_summary.department_name,
dept_summary.avg_salary,
dept_summary.headcount,
(SELECT SUM(salary) FROM employees) AS company_total_salary,
ROUND(
dept_summary.avg_salary / (SELECT AVG(salary) FROM employees) * 100,
1
) AS pct_of_company_avg
FROM (
SELECT
d.department_name,
AVG(e.salary) AS avg_salary,
COUNT(*) AS headcount
FROM departments AS d
JOIN employees AS e ON d.department_id = e.department_id
GROUP BY d.department_name
) AS dept_summary
ORDER BY dept_summary.avg_salary DESC;
Output:
| department_name | avg_salary | headcount | company_total_salary | pct_of_company_avg |
|---|---|---|---|---|
| Engineering | 85000.00 | 3 | 578000 | 117.6 |
| Marketing | 70000.00 | 2 | 578000 | 96.8 |
| Sales | 61000.00 | 3 | 578000 | 84.4 |
The derived table handles the per-department aggregation. The scalar subqueries in the SELECT list inject company-wide totals and averages for context.
Common Mistakes
Mistake 1: Forgetting the Alias on a Derived Table
Most databases require a derived table to have an alias.
Wrong:
SELECT department_id, avg_salary
FROM (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
);
-- ERROR: Every derived table must have its own alias (MySQL)
-- ERROR: subquery in FROM must have an alias (PostgreSQL)
Correct:
SELECT department_id, avg_salary
FROM (
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
) AS dept_stats; -- Alias is required
Mistake 2: Referencing Outer Columns in a Non-Correlated Derived Table
A derived table in FROM is evaluated before the outer query. It cannot reference outer query columns (except in databases that support LATERAL joins, discussed below).
Wrong (in most databases):
SELECT
e.employee_name,
top_earner.max_salary
FROM employees AS e
JOIN (
SELECT MAX(salary) AS max_salary
FROM employees AS e2
WHERE e2.department_id = e.department_id -- ERROR: "e" not recognized
) AS top_earner ON TRUE;
Correct approach using a regular correlated subquery:
SELECT
e.employee_name,
(
SELECT MAX(e2.salary)
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS max_dept_salary
FROM employees AS e;
Or using a non-correlated derived table with a join:
SELECT
e.employee_name,
dept_max.max_salary
FROM employees AS e
JOIN (
SELECT department_id, MAX(salary) AS max_salary
FROM employees
GROUP BY department_id
) AS dept_max ON e.department_id = dept_max.department_id;
PostgreSQL and MySQL 8.0.14+ support LATERAL joins, which allow a derived table to reference columns from preceding tables in the FROM clause. This is an advanced feature that bridges the gap between derived tables and correlated subqueries:
-- PostgreSQL / MySQL 8.0.14+
SELECT
e.employee_name,
top.max_salary
FROM employees AS e
JOIN LATERAL (
SELECT MAX(e2.salary) AS max_salary
FROM employees AS e2
WHERE e2.department_id = e.department_id
) AS top ON TRUE;
In SQL Server, the equivalent feature is CROSS APPLY / OUTER APPLY.
Mistake 3: Unnamed Computed Columns in Derived Tables
If a computed column inside the derived table has no alias, you cannot reference it in the outer query.
Wrong:
SELECT department_id, avg_sal
FROM (
SELECT department_id, AVG(salary) -- No alias for AVG(salary)
FROM employees
GROUP BY department_id
) AS ds;
-- ERROR: column "avg_sal" does not exist
Correct:
SELECT department_id, avg_sal
FROM (
SELECT department_id, AVG(salary) AS avg_sal -- Named!
FROM employees
GROUP BY department_id
) AS ds;
Always name every computed column in a derived table's SELECT list. Some databases assign auto-generated names like avg(salary) or _col1, but relying on these makes your query fragile and non-portable.
Performance Considerations
Scalar Subqueries in SELECT
- Non-correlated scalar subqueries in
SELECTare evaluated once. They are essentially free in terms of performance. - Correlated scalar subqueries in
SELECTare logically evaluated once per row. For a table with N rows, the subquery runs N times. On large tables, this can be slow. - Optimization: If you have multiple correlated scalar subqueries referencing the same inner table and correlated column, consider replacing them with a single derived table join, which computes all needed values in one pass.
Slow (three correlated subqueries, three passes):
SELECT
e.employee_name,
(SELECT AVG(e2.salary) FROM employees e2 WHERE e2.department_id = e.department_id) AS avg_sal,
(SELECT MAX(e2.salary) FROM employees e2 WHERE e2.department_id = e.department_id) AS max_sal,
(SELECT COUNT(*) FROM employees e2 WHERE e2.department_id = e.department_id) AS headcount
FROM employees AS e;
Faster (one derived table, one pass):
SELECT
e.employee_name,
ds.avg_sal,
ds.max_sal,
ds.headcount
FROM employees AS e
JOIN (
SELECT
department_id,
AVG(salary) AS avg_sal,
MAX(salary) AS max_sal,
COUNT(*) AS headcount
FROM employees
GROUP BY department_id
) AS ds ON e.department_id = ds.department_id;
Both produce identical results, but the second version scans the employees table once for the aggregation instead of three times.
Derived Tables in FROM
- The database materializes the derived table result before the outer query runs. For small result sets, this is fast. For very large intermediate results, it consumes memory.
- Most databases create an internal temporary table for the derived table. Some optimizers can "merge" simple derived tables into the outer query, avoiding materialization entirely.
- Index usage: The derived table result has no indexes. If the outer query needs to filter or join on derived table columns, performance can degrade. In such cases, a CTE or a temporary table with explicit indexes might be better.
-- Use EXPLAIN to check how your database handles the derived table
EXPLAIN ANALYZE
SELECT e.employee_name, ds.avg_sal
FROM employees AS e
JOIN (
SELECT department_id, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id
) AS ds ON e.department_id = ds.department_id;
Quick Reference
| Placement | Returns | Use Case | Alias Required? |
|---|---|---|---|
Scalar subquery in SELECT | One row, one column | Computed column per row | Yes (for the column) |
Derived table in FROM | Multiple rows and columns | Pre-aggregation, intermediate result sets | Yes (for the table) |
Correlated scalar in SELECT | One row, one column (varies per row) | Per-row lookups, counts, rankings | Yes (for the column) |
Summary
An SQL subquery in SELECT produces a computed column that is calculated for every row in the result set. When non-correlated, it injects a constant value like an overall average or total. When correlated, it computes a row-specific value like a department average or a count of related records. The strict rule is that it must return exactly one row and one column.
A subquery in the FROM clause (a derived table or inline view) creates a temporary result set that the outer query treats as a regular table. It is ideal for pre-aggregating data, computing intermediate values, and then joining those results back to detail rows. Every derived table must have an alias, and every computed column inside it should be explicitly named.
Key takeaways:
- Use scalar subqueries in
SELECTwhen you need a single computed value per row, such as a global stat for comparison or a correlated count/average. - Use derived tables in
FROMwhen you need to pre-aggregate or transform data before the outer query processes it. - Prefer derived tables over multiple correlated subqueries when you need several aggregated values from the same group, as derived tables compute everything in a single pass.
- Always provide aliases for both derived tables and computed columns.
- When derived tables nest deeply, switch to CTEs for better readability.
- Use
EXPLAINto verify that the database handles your subqueries efficiently, especially correlated ones on large tables.
Mastering subqueries in SELECT and FROM gives you the building blocks to write queries that answer complex analytical questions in a single, self-contained SQL statement.