Recursive Common Table Expressions (CTEs) in SQL
Some data structures are inherently hierarchical: organizational charts where employees report to managers, category trees where subcategories nest inside parent categories, threaded comment systems, file system directories, and bill-of-materials breakdowns. Querying this kind of data with standard SQL is extremely awkward. You would need to write a separate JOIN for every possible level of depth, and if the depth is unknown, you are stuck.
SQL recursive CTEs solve this problem elegantly. A recursive Common Table Expression calls itself repeatedly, walking through hierarchical relationships level by level until there is nothing left to traverse. This guide covers everything you need to know: the syntax, how recursion works step by step, how to build org charts and category trees, how to set stopping conditions, and the common mistakes that can crash your database with infinite loops.
Prerequisites
You should be comfortable with regular (non-recursive) CTEs and the WITH clause. If the basic CTE syntax is unfamiliar, review a guide on Common Table Expressions first.
Recursive CTEs are supported by PostgreSQL, MySQL 8+, SQL Server, Oracle, SQLite 3.8.3+, and MariaDB 10.2.1+.
What Makes a CTE Recursive?
A regular CTE runs its query once and produces a result. A recursive CTE runs its query repeatedly, feeding each iteration's output back as input to the next iteration, until no new rows are produced. This self-referencing behavior is what makes it "recursive."
The key signal in syntax is the RECURSIVE keyword (required in PostgreSQL, MySQL, and SQLite; optional in SQL Server and Oracle) and the fact that the CTE references itself inside its own definition.
Syntax for Recursion
Structure
WITH RECURSIVE cte_name AS (
-- Anchor member: the starting point (non-recursive)
SELECT ...
FROM some_table
WHERE starting_condition
UNION ALL
-- Recursive member: references the CTE itself
SELECT ...
FROM some_table
JOIN cte_name ON relationship_condition
)
SELECT * FROM cte_name;
A recursive CTE has three essential parts:
| Part | Purpose | Runs |
|---|---|---|
| Anchor member | Defines the starting rows (the "root" of the recursion) | Once |
UNION ALL | Combines anchor results with each recursive iteration | Every iteration |
| Recursive member | References the CTE itself to find the next level of rows | Repeatedly until empty |
How It Works Step by Step
- The anchor member executes and produces the initial set of rows.
- The recursive member executes, using the rows from the previous iteration as its input.
- The new rows from step 2 are added to the result and also become the input for the next iteration.
- Step 2 repeats until the recursive member returns zero rows.
- All accumulated rows from every iteration are combined and returned as the final result.
Think of it like peeling an onion layer by layer. The anchor gives you the outermost layer. Each recursive step peels one more layer. When there are no more layers, the process stops.
The Sample Data: Employee Hierarchy
We will use an employees table that models an organizational chart. Each employee has a manager_id that references another employee's id. The CEO has manager_id = NULL because they report to no one.
employees table:
| id | name | title | manager_id |
|---|---|---|---|
| 1 | Sarah | CEO | NULL |
| 2 | James | VP Engineering | 1 |
| 3 | Maria | VP Sales | 1 |
| 4 | Robert | Engineering Manager | 2 |
| 5 | Linda | Sales Manager | 3 |
| 6 | David | Senior Developer | 4 |
| 7 | Jennifer | Developer | 4 |
| 8 | Michael | Sales Rep | 5 |
| 9 | Lisa | Sales Rep | 5 |
| 10 | Kevin | Junior Developer | 6 |
This represents the following org chart:
Sarah (CEO)
├── James (VP Engineering)
│ └── Robert (Engineering Manager)
│ ├── David (Senior Developer)
│ │ └── Kevin (Junior Developer)
│ └── Jennifer (Developer)
└── Maria (VP Sales)
└── Linda (Sales Manager)
├── Michael (Sales Rep)
└── Lisa (Sales Rep)
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
title VARCHAR(50) NOT NULL,
manager_id INT NULL,
CONSTRAINT fk_manager
FOREIGN KEY (manager_id) REFERENCES employees(id)
);
INSERT INTO employees (id, name, title, manager_id) VALUES
(1, 'Sarah', 'CEO', NULL),
(2, 'James', 'VP Engineering', 1),
(3, 'Maria', 'VP Sales', 1),
(4, 'Robert', 'Engineering Manager', 2),
(5, 'Linda', 'Sales Manager', 3),
(6, 'David', 'Senior Developer', 4),
(7, 'Jennifer', 'Developer', 4),
(8, 'Michael', 'Sales Rep', 5),
(9, 'Lisa', 'Sales Rep', 5),
(10, 'Kevin', 'Junior Developer', 6);
Building an Org Chart with a Recursive CTE
Finding All Employees Under a Manager
Let's find every employee who reports to Sarah (the CEO), directly or indirectly:
WITH RECURSIVE org_chart AS (
SELECT
id,
name,
title,
manager_id,
1 AS level
FROM employees
WHERE name = 'Sarah'
UNION ALL
SELECT
e.id,
e.name,
e.title,
e.manager_id,
oc.level + 1 AS level
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT
level,
CONCAT(REPEAT(' ', (level - 1) * 4), name) AS org_tree,
title
FROM org_chart
ORDER BY level, name;
Output:
| level | org_tree | title |
|---|---|---|
| 1 | Sarah | CEO |
| 2 | James | VP Engineering |
| 2 | Maria | VP Sales |
| 3 | Linda | Sales Manager |
| 3 | Robert | Engineering Manager |
| 4 | David | Senior Developer |
| 4 | Jennifer | Developer |
| 4 | Lisa | Sales Rep |
| 4 | Michael | Sales Rep |
| 5 | Kevin | Junior Developer |
Let's trace through each iteration:
- Iteration 1 (anchor): Finds Sarah (level 1). Result: 1 row.
- Iteration 2 (recursive): Finds employees where
manager_id = Sarah's id (1). That is James and Maria (level 2). Result: 2 rows. - Iteration 3: Finds employees where
manager_id IN (2, 3). That is Robert and Linda (level 3). Result: 2 rows. - Iteration 4: Finds employees where
manager_id IN (4, 5). That is David, Jennifer, Michael, and Lisa (level 4). Result: 4 rows. - Iteration 5: Finds employees where
manager_id IN (6, 7, 8, 9). Only Kevin hasmanager_id = 6(level 5). Result: 1 row. - Iteration 6: Finds employees where
manager_id = 10. Nobody. Result: 0 rows. Recursion stops.
The query above works in PostgreSQL: MySQL treats || as a logical OR operator, not as string concatenation. So replace || with MySQL's CONCAT() function.
Starting from a Different Manager
You can start the org chart from any manager by simply changing the anchor condition:
WITH RECURSIVE team AS (
-- Anchor: Start with Robert
SELECT id, name, title, manager_id, 1 AS level
FROM employees
WHERE name = 'Robert'
UNION ALL
-- Recursive: Find Robert's direct and indirect reports
SELECT e.id, e.name, e.title, e.manager_id, t.level + 1
FROM employees e
JOIN team t ON e.manager_id = t.id
)
SELECT level, name, title
FROM team
ORDER BY level, name;
Output:
| level | name | title |
|---|---|---|
| 1 | Robert | Engineering Manager |
| 2 | David | Senior Developer |
| 2 | Jennifer | Developer |
| 3 | Kevin | Junior Developer |
Only Robert's subtree is traversed.
Building a Full Path
A common requirement is to show the full management chain for each employee, from the CEO down to their position. You can do this by concatenating names at each recursive step:
WITH RECURSIVE org_path AS (
SELECT
id,
name,
title,
manager_id,
CAST(name AS VARCHAR(1000)) AS full_path,
1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT
e.id,
e.name,
e.title,
e.manager_id,
CAST(op.full_path || ' > ' || e.name AS VARCHAR(1000)) AS full_path,
op.level + 1
FROM employees e
JOIN org_path op ON e.manager_id = op.id
)
SELECT name, title, full_path
FROM org_path
ORDER BY full_path;
Output:
| name | title | full_path |
|---|---|---|
| Sarah | CEO | Sarah |
| James | VP Engineering | Sarah > James |
| Robert | Engineering Manager | Sarah > James > Robert |
| David | Senior Developer | Sarah > James > Robert > David |
| Kevin | Junior Developer | Sarah > James > Robert > David > Kevin |
| Jennifer | Developer | Sarah > James > Robert > Jennifer |
| Maria | VP Sales | Sarah > Maria |
| Linda | Sales Manager | Sarah > Maria > Linda |
| Lisa | Sales Rep | Sarah > Maria > Linda > Lisa |
| Michael | Sales Rep | Sarah > Maria > Linda > Michael |
Every employee now carries their complete organizational path. This is extremely useful for reporting, breadcrumb navigation, and understanding reporting lines.
Hierarchical Data: Category Trees
Recursive CTEs are not limited to org charts. Any self-referencing table benefits from them. Let's look at a product category tree.
categories table:
| id | name | parent_id |
|---|---|---|
| 1 | Electronics | NULL |
| 2 | Clothing | NULL |
| 3 | Computers | 1 |
| 4 | Phones | 1 |
| 5 | Men's Wear | 2 |
| 6 | Women's Wear | 2 |
| 7 | Laptops | 3 |
| 8 | Desktops | 3 |
| 9 | Smartphones | 4 |
| 10 | Feature Phones | 4 |
| 11 | Gaming Laptops | 7 |
| 12 | Business Laptops | 7 |
CREATE TABLE categories (
id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
parent_id INT NULL,
CONSTRAINT fk_parent
FOREIGN KEY (parent_id) REFERENCES categories(id)
);
INSERT INTO categories (id, name, parent_id) VALUES
(1, 'Electronics', NULL),
(2, 'Clothing', NULL),
(3, 'Computers', 1),
(4, 'Phones', 1),
(5, 'Men''s Wear', 2),
(6, 'Women''s Wear', 2),
(7, 'Laptops', 3),
(8, 'Desktops', 3),
(9, 'Smartphones', 4),
(10, 'Feature Phones', 4),
(11, 'Gaming Laptops', 7),
(12, 'Business Laptops', 7);
Expanding the Full Category Tree
WITH RECURSIVE category_tree AS (
SELECT
id,
name,
parent_id,
CAST(name AS VARCHAR(1000)) AS full_path,
1 AS depth
FROM categories
WHERE parent_id IS NULL
UNION ALL
SELECT
c.id,
c.name,
c.parent_id,
CAST(ct.full_path || ' / ' || c.name AS VARCHAR(1000)) AS full_path,
ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT
LPAD('', (depth - 1) * 3) || name AS category,
full_path,
depth
FROM category_tree
ORDER BY full_path;
Output:
| category | full_path | depth |
|---|---|---|
| Clothing | Clothing | 1 |
| Men's Wear | Clothing / Men's Wear | 2 |
| Women's Wear | Clothing / Women's Wear | 2 |
| Electronics | Electronics | 1 |
| Computers | Electronics / Computers | 2 |
| Desktops | Electronics / Computers / Desktops | 3 |
| Laptops | Electronics / Computers / Laptops | 3 |
| Business Laptops | Electronics / Computers / Laptops / Business Laptops | 4 |
| Gaming Laptops | Electronics / Computers / Laptops / Gaming Laptops | 4 |
| Phones | Electronics / Phones | 2 |
| Feature Phones | Electronics / Phones / Feature Phones | 3 |
| Smartphones | Electronics / Phones / Smartphones | 3 |
Finding All Ancestors of a Category
Sometimes you need to go upward instead of downward. For example, given "Gaming Laptops," find all parent categories up to the root:
WITH RECURSIVE ancestors AS (
-- Anchor: Start with the target category
SELECT id, name, parent_id, 1 AS level
FROM categories
WHERE name = 'Gaming Laptops'
UNION ALL
-- Recursive: Find each parent
SELECT c.id, c.name, c.parent_id, a.level + 1
FROM categories c
JOIN ancestors a ON c.id = a.parent_id
)
SELECT name, level
FROM ancestors
ORDER BY level DESC;
Output:
| name | level |
|---|---|
| Electronics | 4 |
| Computers | 3 |
| Laptops | 2 |
| Gaming Laptops | 1 |
The recursion walks upward through parent_id instead of downward through children. This is useful for building breadcrumb navigation: Electronics > Computers > Laptops > Gaming Laptops.
- Top-down (find all descendants): The recursive member joins
child.parent_id = cte.id - Bottom-up (find all ancestors): The recursive member joins
parent.id = cte.parent_id
The anchor determines your starting point, and the join condition determines the direction.
Stopping Conditions
Recursion must stop at some point. Without proper stopping conditions, a recursive CTE will run indefinitely (or until the database hits a safety limit and throws an error). There are several ways to control termination.
Natural Termination
The most common and safest stopping mechanism is natural termination: the recursive member simply returns zero rows because there are no more matching records.
In our org chart example, when Kevin (id=10) is processed, the recursive member looks for employees with manager_id = 10. Nobody has that, so zero rows are returned and recursion stops naturally.
Depth Limiting with a Level Counter
You can add an explicit depth limit to prevent the recursion from going too deep:
WITH RECURSIVE limited_tree AS (
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, lt.level + 1
FROM employees e
JOIN limited_tree lt ON e.manager_id = lt.id
WHERE lt.level < 3 -- Stop after 3 levels
)
SELECT level, name
FROM limited_tree
ORDER BY level, name;
Output:
| level | name |
|---|---|
| 1 | Sarah |
| 2 | James |
| 2 | Maria |
| 3 | Linda |
| 3 | Robert |
The recursion stops at level 3 even though deeper levels exist. The WHERE lt.level < 3 condition in the recursive member prevents it from generating level 4 rows.
Database Safety Limits
Most databases have built-in safety limits to prevent runaway recursion:
| Database | Default Limit | How to Change |
|---|---|---|
| SQL Server | 100 iterations | OPTION (MAXRECURSION N) at the end of the query. Use 0 for unlimited. |
| PostgreSQL | No hard limit (memory-based) | SET statement_timeout to limit execution time |
| MySQL 8+ | 1000 iterations | SET cte_max_recursion_depth = N; |
| SQLite | 1000 iterations | PRAGMA recursive_triggers; or recompile with custom limit |
-- SQL Server (does not use the RECURSIVE keyword)
WITH deep_tree AS (
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, dt.level + 1
FROM employees e
JOIN deep_tree dt ON e.manager_id = dt.id
)
SELECT * FROM deep_tree
OPTION (MAXRECURSION 500);
-- MySQL: Increase the recursion limit
SET cte_max_recursion_depth = 5000;
Generating Sequences with Recursive CTEs
Recursive CTEs are not only for hierarchical data. They can also generate sequences of numbers or dates, which is extremely useful for filling gaps in time-series data or creating calendar tables.
Generating a Number Sequence
WITH RECURSIVE numbers AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1
FROM numbers
WHERE n < 10
)
SELECT n FROM numbers;
Output:
| n |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
The anchor starts at 1. Each recursive step adds 1. The WHERE n < 10 condition stops the recursion after reaching 10.
Generating a Date Series
-- Generate date series from 2024-01-01 to 2024-01-07
WITH RECURSIVE date_series AS (
-- Anchor row
SELECT DATE('2024-01-01') AS dt
UNION ALL
-- Recursive step
SELECT dt + INTERVAL 1 DAY
FROM date_series
WHERE dt < DATE('2024-01-07')
)
SELECT dt
FROM date_series
ORDER BY dt;
Output:
| dt |
|---|
| 2024-01-01 |
| 2024-01-02 |
| 2024-01-03 |
| 2024-01-04 |
| 2024-01-05 |
| 2024-01-06 |
| 2024-01-07 |
PostgreSQL has a built-in generate_series() function that is more efficient for this purpose. However, the recursive CTE approach works across all databases, making it the portable solution.
Filling Gaps in Time-Series Data
Combine a generated date series with a LEFT JOIN to ensure every date appears, even dates with no data. The following example assumes an orders table with order_date and amount columns:
WITH RECURSIVE all_dates AS (
-- Anchor date
SELECT DATE('2024-01-01') AS dt
UNION ALL
-- Recursive step
SELECT dt + INTERVAL 1 DAY
FROM all_dates
WHERE dt < DATE('2024-01-10')
),
daily_totals AS (
SELECT
DATE(order_date) AS order_day,
SUM(amount) AS total_amount
FROM orders
GROUP BY DATE(order_date)
)
SELECT
ad.dt AS order_day,
COALESCE(d.total_amount, 0) AS total_amount
FROM all_dates ad
LEFT JOIN daily_totals d
ON ad.dt = d.order_day
ORDER BY ad.dt;
This ensures every day from January 1 to January 10 appears in the result, with 0 for days that had no orders.
UNION ALL vs UNION in Recursive CTEs
You may wonder whether you can use UNION (which removes duplicates) instead of UNION ALL in a recursive CTE.
UNION ALL (Standard and Required in Most Cases)
UNION ALL keeps all rows including duplicates. This is the standard and required form in most databases:
WITH RECURSIVE cte AS (
SELECT ... -- Anchor
UNION ALL
SELECT ... -- Recursive
FROM cte ...
)
UNION (Duplicate Elimination)
Some databases (PostgreSQL, SQLite) allow UNION in recursive CTEs. This eliminates duplicate rows at each iteration, which can serve as a natural stopping condition for graphs with cycles:
-- PostgreSQL: UNION prevents revisiting the same node
WITH RECURSIVE cte AS (
SELECT ...
UNION -- Deduplicates at each step
SELECT ...
FROM cte ...
)
- SQL Server and MySQL do not support
UNION(withoutALL) in recursive CTEs. They will throw a syntax error. - Even in databases that support it,
UNIONadds overhead because it must check for duplicates at every iteration. - Prefer
UNION ALLwith explicit cycle detection (shown below) for portability and clarity.
Preventing Infinite Loops
When your data has cycles (e.g., employee A reports to B, and B reports to A), a recursive CTE will loop forever unless you add cycle detection. Even if you believe your data is clean, defensive coding prevents catastrophic runaway queries.
Cycle Detection with a Path
Track the visited nodes by building a path string and checking for duplicates:
WITH RECURSIVE safe_tree AS (
SELECT
id,
name,
manager_id,
ARRAY[id] AS visited_ids, -- PostgreSQL array syntax
FALSE AS has_cycle,
1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT
e.id,
e.name,
e.manager_id,
st.visited_ids || e.id,
e.id = ANY(st.visited_ids) AS has_cycle,
st.level + 1
FROM employees e
JOIN safe_tree st ON e.manager_id = st.id
WHERE NOT e.id = ANY(st.visited_ids) -- Stop if cycle detected
)
SELECT level, name, visited_ids
FROM safe_tree
ORDER BY level, name;
The visited_ids array accumulates every node's id along the path. Before adding a new row, the WHERE NOT e.id = ANY(st.visited_ids) check ensures the node has not been visited already. If it has, the row is excluded, breaking the cycle.
Portable Cycle Detection with a Path String
If your database does not support arrays, use a string-based path:
WITH RECURSIVE safe_tree AS (
SELECT
id,
name,
manager_id,
CAST('/' || CAST(id AS VARCHAR(10)) || '/' AS VARCHAR(1000)) AS path,
1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT
e.id,
e.name,
e.manager_id,
CAST(st.path || CAST(e.id AS VARCHAR(10)) || '/' AS VARCHAR(1000)),
st.level + 1
FROM employees e
JOIN safe_tree st ON e.manager_id = st.id
WHERE st.path NOT LIKE '%/' || CAST(e.id AS VARCHAR(10)) || '/%'
)
SELECT level, name, path
FROM safe_tree
ORDER BY level, name;
The path string /1/2/4/6/ is checked with NOT LIKE to see if the new node's id is already present. This works across most databases that support || for concatenation.
Combined Safety: Depth Limit + Cycle Detection
For maximum safety, combine both approaches:
WITH RECURSIVE safe_tree AS (
SELECT
id, name, manager_id,
CAST('/' || CAST(id AS VARCHAR(10)) || '/' AS VARCHAR(1000)) AS path,
1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT
e.id, e.name, e.manager_id,
CAST(st.path || CAST(e.id AS VARCHAR(10)) || '/' AS VARCHAR(1000)),
st.level + 1
FROM employees e
JOIN safe_tree st ON e.manager_id = st.id
WHERE st.level < 20 -- Depth limit
AND st.path NOT LIKE '%/' || CAST(e.id AS VARCHAR(10)) || '/%' -- Cycle check
)
SELECT * FROM safe_tree;
Always include at least one stopping safeguard in production recursive CTEs. Even if your data is cycle-free today, a future data entry error could introduce a cycle and bring down your query (and possibly your database).
Common Mistakes to Avoid
Mistake 1: Forgetting the RECURSIVE Keyword
In PostgreSQL, MySQL, and SQLite, the RECURSIVE keyword is required:
-- WRONG in PostgreSQL/MySQL: Missing RECURSIVE
WITH org_chart AS (
SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employees e JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart;
Error: recursive reference to query "org_chart" is not allowed in this context
Fix:
-- CORRECT: Add RECURSIVE
WITH RECURSIVE org_chart AS (
...
)
SELECT * FROM org_chart;
In SQL Server, the RECURSIVE keyword is not used. SQL Server detects recursion automatically when a CTE references itself. In Oracle, the RECURSIVE keyword is also not used; Oracle has its own syntax conventions.
Mistake 2: Using UNION Instead of UNION ALL on Unsupported Databases
-- WRONG in SQL Server and MySQL
WITH RECURSIVE cte AS (
SELECT 1 AS n
UNION -- Not supported in recursive CTEs
SELECT n + 1 FROM cte WHERE n < 5
)
SELECT * FROM cte;
Error (MySQL): Recursive Common Table Expression 'cte' should contain a UNION ALL
Fix: Use UNION ALL:
WITH RECURSIVE cte AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM cte WHERE n < 5
)
SELECT * FROM cte;
Mistake 3: Missing the Stopping Condition
-- DANGEROUS: No stopping condition!
WITH RECURSIVE infinite AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM infinite -- Runs forever
)
SELECT * FROM infinite;
This will generate numbers 1, 2, 3, ... until the database either runs out of memory or hits its recursion limit.
Fix: Add a WHERE clause to the recursive member:
WITH RECURSIVE safe AS (
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM safe WHERE n < 100 -- Stops at 100
)
SELECT * FROM safe;
Mistake 4: Aggregate Functions in the Recursive Member
Most databases do not allow aggregate functions (SUM, COUNT, AVG, etc.) in the recursive part of a CTE:
-- WRONG: Aggregate in recursive member
WITH RECURSIVE bad_cte AS (
SELECT 1 AS level, 100 AS total
UNION ALL
SELECT level + 1, SUM(total) -- Not allowed
FROM bad_cte
WHERE level < 5
)
SELECT * FROM bad_cte;
Error: aggregate functions are not allowed in a recursive query's recursive term
If you need aggregation, compute it outside the recursive CTE:
WITH RECURSIVE tree AS (
...
)
SELECT level, COUNT(*), SUM(some_column)
FROM tree
GROUP BY level;
Mistake 5: Multiple References to the Recursive CTE in the Recursive Member
The recursive member can reference the CTE only once. Self-joining the CTE in the recursive member is not allowed in most databases:
-- WRONG: Two references to the same recursive CTE
WITH RECURSIVE bad AS (
SELECT 1 AS id
UNION ALL
SELECT b1.id + b2.id
FROM bad b1
JOIN bad b2 ON b1.id = b2.id -- Not allowed
)
SELECT * FROM bad;
Error: recursive reference to query must not appear more than once
Real-World Pattern: Bill of Materials
A bill of materials (BOM) describes how a finished product is assembled from components, which may themselves be assembled from sub-components. This is classic recursive data.
parts table:
| id | part_name | parent_id | quantity | unit_cost |
|---|---|---|---|---|
| 1 | Bicycle | NULL | 1 | 0 |
| 2 | Frame | 1 | 1 | 150 |
| 3 | Wheel Assembly | 1 | 2 | 0 |
| 4 | Tire | 3 | 1 | 25 |
| 5 | Rim | 3 | 1 | 35 |
| 6 | Spokes | 3 | 36 | 0.50 |
| 7 | Handlebar | 1 | 1 | 40 |
CREATE TABLE parts (
id INT PRIMARY KEY,
part_name VARCHAR(50) NOT NULL,
parent_id INT NULL,
quantity INT NOT NULL,
unit_cost DECIMAL(10,2) NOT NULL,
CONSTRAINT fk_part_parent
FOREIGN KEY (parent_id) REFERENCES parts(id)
);
INSERT INTO parts (id, part_name, parent_id, quantity, unit_cost) VALUES
(1, 'Bicycle', NULL, 1, 0),
(2, 'Frame', 1, 1, 150),
(3, 'Wheel Assembly', 1, 2, 0),
(4, 'Tire', 3, 1, 25),
(5, 'Rim', 3, 1, 35),
(6, 'Spokes', 3, 36, 0.50),
(7, 'Handlebar', 1, 1, 40);
WITH RECURSIVE bom AS (
SELECT
id,
part_name,
parent_id,
quantity,
unit_cost,
quantity AS total_qty,
1 AS depth
FROM parts
WHERE parent_id IS NULL
UNION ALL
SELECT
p.id,
p.part_name,
p.parent_id,
p.quantity,
p.unit_cost,
p.quantity * b.total_qty AS total_qty,
b.depth + 1
FROM parts p
JOIN bom b ON p.parent_id = b.id
)
SELECT
CONCAT(LPAD('', (depth - 1) * 3, ' '), part_name) AS component,
total_qty,
unit_cost,
total_qty * unit_cost AS line_cost
FROM bom
ORDER BY depth, part_name;
Output:
| component | total_qty | unit_cost | line_cost |
|---|---|---|---|
| Bicycle | 1 | 0.00 | 0.00 |
| Frame | 1 | 150.00 | 150.00 |
| Handlebar | 1 | 40.00 | 40.00 |
| Wheel Assembly | 2 | 0.00 | 0.00 |
| Rim | 2 | 35.00 | 70.00 |
| Spokes | 72 | 0.50 | 36.00 |
| Tire | 2 | 25.00 | 50.00 |
Notice how total_qty for Spokes is 72 (36 spokes per wheel assembly × 2 wheel assemblies). The recursive multiplication p.quantity * b.total_qty propagates quantities through the hierarchy correctly.
Performance Considerations
- Index the self-referencing foreign key. In the
employeestable, create an index onmanager_id. Incategories, indexparent_id. Without this, each recursive step performs a full table scan.
CREATE INDEX idx_employees_manager ON employees(manager_id);
CREATE INDEX idx_categories_parent ON categories(parent_id);
-
Keep recursion shallow when possible. Each level of recursion is essentially another join operation. Hierarchies with 5 to 10 levels are typical and efficient. Hierarchies with hundreds of levels may need optimization.
-
Always set a maximum depth. Even with clean data, a depth limit acts as a safety net against unexpected cycles or abnormally deep hierarchies.
-
Avoid heavy computation in the recursive member. Keep the recursive part lean. Do complex calculations (aggregations, string operations, etc.) in the outer query that reads from the completed CTE.
-
Consider materialized path or nested set models for read-heavy workloads. If you query the same hierarchy thousands of times per second, a recursive CTE on every request may be too slow. Precomputing paths or using specialized hierarchical storage models can be more efficient for extreme read performance.
Summary
SQL recursive CTEs unlock the ability to query hierarchical and graph-structured data that would be impossible with standard SQL:
- A recursive CTE has three parts: the anchor member (starting point),
UNION ALL, and the recursive member (self-referencing step). - The recursion runs iteratively, not all at once. Each iteration feeds its output to the next until zero rows are returned.
- Common use cases include org charts, category trees, bill of materials, graph traversal, and sequence generation.
- Stopping conditions are critical. Use natural termination (no more matching rows), explicit depth limits (
WHERE level < N), or cycle detection (tracking visited nodes in a path). - Most databases require
UNION ALLin recursive CTEs. Only PostgreSQL and SQLite supportUNIONfor built-in deduplication. - The
RECURSIVEkeyword is required in PostgreSQL, MySQL, and SQLite. SQL Server and Oracle detect recursion automatically. - Always index the self-referencing column (
manager_id,parent_id) for performance.
Think of a recursive CTE as a breadth-first search. The anchor gives you the root nodes. Each recursive iteration explores one level deeper. When there are no more nodes to explore, the search is complete, and all discovered nodes are returned together.