SQL Self Join for Joining a Table to Itself
Every join you have written so far combines two different tables: customers with orders, products with categories, order items with products. But what happens when the relationship you need to explore exists within a single table? An employee reports to a manager, and that manager is also a row in the same employees table. A product is a variation of another product in the same products table. A comment is a reply to another comment in the same comments table.
These are self-referencing relationships, and querying them requires a technique called a self join: joining a table to itself. The table appears twice in the query, each time with a different alias, allowing you to treat it as if it were two separate tables.
Self joins are not a new join type. They use the same INNER JOIN, LEFT JOIN, or any other join you already know. The only difference is that both sides of the join reference the same table. This guide covers why self joins exist, how to write them, common use cases including employee-manager hierarchies and product comparisons, and the aliasing techniques that make them readable. Every example includes full outputs you can verify in your own environment.
Every example uses the ShopSmart sample database with full outputs (we defined it in a previous guide here, and then we extended it here).
Why Would You Join a Table to Itself?
The need for self joins arises whenever a row in a table references another row in the same table. This happens more often than you might expect:
| Scenario | The Relationship |
|---|---|
| Employees and managers | Each employee's manager is also an employee |
| Comments and replies | A reply references the parent comment in the same table |
| Categories and subcategories | A subcategory's parent is another category |
| Product variations | A product variant references its base product |
| Flight connections | A connecting flight links two routes in the same flights table |
| Referral programs | A customer refers another customer |
In all these cases, one column in the table points to another row in the same table, typically by referencing that row's primary key.
Setting Up: The Employees Table
To demonstrate self joins clearly, let us create an employees table with a manager hierarchy:
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name VARCHAR(100) NOT NULL,
role VARCHAR(50) NOT NULL,
department VARCHAR(50),
salary DECIMAL(10, 2),
manager_id INTEGER,
FOREIGN KEY (manager_id) REFERENCES employees(id)
);
INSERT INTO employees VALUES (1, 'Sarah Chen', 'CEO', 'Executive', 250000, NULL);
INSERT INTO employees VALUES (2, 'Marcus Rivera', 'VP Engineering', 'Engineering', 180000, 1);
INSERT INTO employees VALUES (3, 'Linda Park', 'VP Marketing', 'Marketing', 175000, 1);
INSERT INTO employees VALUES (4, 'James Cooper', 'Senior Developer', 'Engineering', 120000, 2);
INSERT INTO employees VALUES (5, 'Aisha Patel', 'Senior Developer', 'Engineering', 125000, 2);
INSERT INTO employees VALUES (6, 'Tom Wilson', 'Junior Developer', 'Engineering', 75000, 4);
INSERT INTO employees VALUES (7, 'Nina Santos', 'Junior Developer', 'Engineering', 72000, 4);
INSERT INTO employees VALUES (8, 'Kevin Zhang', 'Marketing Manager', 'Marketing', 95000, 3);
INSERT INTO employees VALUES (9, 'Rachel Adams', 'Content Writer', 'Marketing', 65000, 8);
INSERT INTO employees VALUES (10, 'David Kim', 'SEO Specialist', 'Marketing', 70000, 8);
Let us examine the data:
SELECT id, name, role, manager_id
FROM employees
ORDER BY id;
Output:
| id | name | role | manager_id |
|---|---|---|---|
| 1 | Sarah Chen | CEO | NULL |
| 2 | Marcus Rivera | VP Engineering | 1 |
| 3 | Linda Park | VP Marketing | 1 |
| 4 | James Cooper | Senior Developer | 2 |
| 5 | Aisha Patel | Senior Developer | 2 |
| 6 | Tom Wilson | Junior Developer | 4 |
| 7 | Nina Santos | Junior Developer | 4 |
| 8 | Kevin Zhang | Marketing Manager | 3 |
| 9 | Rachel Adams | Content Writer | 8 |
| 10 | David Kim | SEO Specialist | 8 |
Notice the manager_id column. Marcus Rivera (id=2) has manager_id = 1, meaning he reports to Sarah Chen (id=1). Tom Wilson (id=6) has manager_id = 4, meaning he reports to James Cooper (id=4). Sarah Chen, as the CEO, has manager_id = NULL because she has no manager.
The hierarchy looks like this:
Sarah Chen (CEO)
├── Marcus Rivera (VP Engineering)
│ ├── James Cooper (Senior Developer)
│ │ ├── Tom Wilson (Junior Developer)
│ │ └── Nina Santos (Junior Developer)
│ └── Aisha Patel (Senior Developer)
└── Linda Park (VP Marketing)
└── Kevin Zhang (Marketing Manager)
├── Rachel Adams (Content Writer)
└── David Kim (SEO Specialist)
The challenge is that manager_id = 1 tells you nothing useful by itself. You need to look up row id = 1 in the same table to learn that the manager is Sarah Chen. This lookup is exactly what a self join does.
Your First Self Join
To join a table to itself, you include it in the FROM clause twice, each time with a different alias. This is where aliases become not just convenient but absolutely required. Without two different aliases, the database cannot distinguish between the two "copies" of the table.
SELECT e.name AS employee,
e.role,
m.name AS manager
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id;
Output:
| employee | role | manager |
|---|---|---|
| Marcus Rivera | VP Engineering | Sarah Chen |
| Linda Park | VP Marketing | Sarah Chen |
| James Cooper | Senior Developer | Marcus Rivera |
| Aisha Patel | Senior Developer | Marcus Rivera |
| Tom Wilson | Junior Developer | James Cooper |
| Nina Santos | Junior Developer | James Cooper |
| Kevin Zhang | Marketing Manager | Linda Park |
| Rachel Adams | Content Writer | Kevin Zhang |
| David Kim | SEO Specialist | Kevin Zhang |
Now you can see who reports to whom in a single, readable result. The manager_id numbers have been replaced with actual manager names.
Breaking Down the Self Join
Let us trace exactly what is happening:
FROM employees e -- First "copy": treat as the employee
INNER JOIN employees m ON e.manager_id = m.id -- Second "copy": treat as the manager
employees e: The first instance of the table, aliased ase(for "employee"). Each row represents the employee.employees m: The second instance of the same table, aliased asm(for "manager"). Each row represents the manager being looked up.ON e.manager_id = m.id: For each employee row, find the row in the "manager copy" where the id matches the employee'smanager_id.
When the database processes Tom Wilson's row (e.manager_id = 4), it searches the m copy for a row where m.id = 4. It finds James Cooper. The join combines Tom's employee data with James's manager data.
The easiest way to understand a self join is to imagine the database making two photocopies of the table. One copy is labeled "employees" and the other is labeled "managers." Then it performs a normal join between the two copies. The aliases (e and m) represent these two conceptual copies.
Both copies contain identical data. The aliases just let you reference each one independently.
Why Sarah Chen Is Missing
Notice that Sarah Chen does not appear in the results. She has manager_id = NULL, and INNER JOIN excludes rows with no match. Since no row has id = NULL, Sarah has no manager to join to.
This is often the correct behavior for an INNER JOIN: you are asking "show me employees and their managers," and the CEO has no manager. But if you want to show all employees, including those without managers, you need a LEFT JOIN.
Self Join with LEFT JOIN: Including the Top of the Hierarchy
SELECT e.name AS employee,
e.role,
COALESCE(m.name, 'No Manager (Top Level)') AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY m.name, e.name;
Output:
| employee | role | manager |
|---|---|---|
| Sarah Chen | CEO | No Manager (Top Level) |
| Nina Santos | Junior Developer | James Cooper |
| Tom Wilson | Junior Developer | James Cooper |
| David Kim | SEO Specialist | Kevin Zhang |
| Rachel Adams | Content Writer | Kevin Zhang |
| Kevin Zhang | Marketing Manager | Linda Park |
| Aisha Patel | Senior Developer | Marcus Rivera |
| James Cooper | Senior Developer | Marcus Rivera |
| Linda Park | VP Marketing | Sarah Chen |
| Marcus Rivera | VP Engineering | Sarah Chen |
Wait, let's correct the sort. COALESCE converts NULL to a string, which affects sorting. Let us use a cleaner approach:
SELECT e.name AS employee,
e.role,
e.department,
COALESCE(m.name, '—') AS manager,
COALESCE(m.role, 'Top Level') AS manager_role
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY e.department, e.name;
Output:
| employee | role | department | manager | manager_role |
|---|---|---|---|---|
| Aisha Patel | Senior Developer | Engineering | Marcus Rivera | VP Engineering |
| James Cooper | Senior Developer | Engineering | Marcus Rivera | VP Engineering |
| Marcus Rivera | VP Engineering | Engineering | Sarah Chen | CEO |
| Nina Santos | Junior Developer | Engineering | James Cooper | Senior Developer |
| Tom Wilson | Junior Developer | Engineering | James Cooper | Senior Developer |
| Sarah Chen | CEO | Executive | — | Top Level |
| David Kim | SEO Specialist | Marketing | Kevin Zhang | Marketing Manager |
| Kevin Zhang | Marketing Manager | Marketing | Linda Park | VP Marketing |
| Linda Park | VP Marketing | Marketing | Sarah Chen | CEO |
| Rachel Adams | Content Writer | Marketing | Kevin Zhang | Marketing Manager |
All 10 employees appear. Sarah Chen has no manager, displayed as "—" and "Top Level."
Finding Direct Reports
A self join also works in the opposite direction: finding who reports to a specific person.
Who Reports to Marcus Rivera?
SELECT e.name AS employee,
e.role,
e.salary
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id
WHERE m.name = 'Marcus Rivera';
Output:
| employee | role | salary |
|---|---|---|
| James Cooper | Senior Developer | 120000.00 |
| Aisha Patel | Senior Developer | 125000.00 |
Counting Direct Reports Per Manager
SELECT m.name AS manager,
m.role,
COUNT(e.id) AS direct_reports
FROM employees m
LEFT JOIN employees e ON m.id = e.manager_id
GROUP BY m.id, m.name, m.role
HAVING COUNT(e.id) > 0
ORDER BY direct_reports DESC;
Output:
| manager | role | direct_reports |
|---|---|---|
| Sarah Chen | CEO | 2 |
| Marcus Rivera | VP Engineering | 2 |
| James Cooper | Senior Developer | 2 |
| Kevin Zhang | Marketing Manager | 2 |
| Linda Park | VP Marketing | 1 |
Each manager has between 1 and 2 direct reports.
Complete Management Report
SELECT m.name AS manager,
m.role AS manager_role,
m.department,
COUNT(e.id) AS direct_reports,
ROUND(AVG(e.salary), 2) AS avg_report_salary,
MIN(e.salary) AS min_report_salary,
MAX(e.salary) AS max_report_salary
FROM employees m
INNER JOIN employees e ON m.id = e.manager_id
GROUP BY m.id, m.name, m.role, m.department
ORDER BY direct_reports DESC, avg_report_salary DESC;
Output:
| manager | manager_role | department | direct_reports | avg_report_salary | min_report_salary | max_report_salary |
|---|---|---|---|---|---|---|
| Sarah Chen | CEO | Executive | 2 | 177500.00 | 175000.00 | 180000.00 |
| Marcus Rivera | VP Engineering | Engineering | 2 | 122500.00 | 120000.00 | 125000.00 |
| James Cooper | Senior Developer | Engineering | 2 | 73500.00 | 72000.00 | 75000.00 |
| Kevin Zhang | Marketing Manager | Marketing | 2 | 67500.00 | 65000.00 | 70000.00 |
| Linda Park | VP Marketing | Marketing | 1 | 95000.00 | 95000.00 | 95000.00 |
Multi-Level Hierarchies
Self joins can be chained to traverse multiple levels of a hierarchy.
Employee, Manager, and Skip-Level Manager
SELECT e.name AS employee,
e.role,
m.name AS manager,
gm.name AS grand_manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
LEFT JOIN employees gm ON m.manager_id = gm.id
ORDER BY gm.name, m.name, e.name;
Output:
| employee | role | manager | grand_manager |
|---|---|---|---|
| James Cooper | Senior Developer | Marcus Rivera | Sarah Chen |
| Aisha Patel | Senior Developer | Marcus Rivera | Sarah Chen |
| Kevin Zhang | Marketing Manager | Linda Park | Sarah Chen |
| Tom Wilson | Junior Developer | James Cooper | Marcus Rivera |
| Nina Santos | Junior Developer | James Cooper | Marcus Rivera |
| Rachel Adams | Content Writer | Kevin Zhang | Linda Park |
| David Kim | SEO Specialist | Kevin Zhang | Linda Park |
| Marcus Rivera | VP Engineering | Sarah Chen | NULL |
| Linda Park | VP Marketing | Sarah Chen | NULL |
| Sarah Chen | CEO | NULL | NULL |
Three instances of the same table, aliased as e, m, and gm, give us a three-level view of the hierarchy.
Full Chain: Employee to CEO
For deeper hierarchies, you can keep chaining:
SELECT e.name AS employee,
m1.name AS manager,
m2.name AS directors_manager,
m3.name AS top_level
FROM employees e
LEFT JOIN employees m1 ON e.manager_id = m1.id
LEFT JOIN employees m2 ON m1.manager_id = m2.id
LEFT JOIN employees m3 ON m2.manager_id = m3.id
WHERE e.name = 'Tom Wilson';
Output:
| employee | manager | directors_manager | top_level |
|---|---|---|---|
| Tom Wilson | James Cooper | Marcus Rivera | Sarah Chen |
Tom Wilson reports to James Cooper, who reports to Marcus Rivera, who reports to Sarah Chen (CEO). The full chain is visible in one row.
Each additional LEFT JOIN handles one more level of the hierarchy, but you must know the maximum depth in advance. If your hierarchy has 5 levels, you need 4 self joins. If it has 20 levels, you need 19 self joins.
For hierarchies with unknown or variable depth, SQL offers recursive CTEs (Common Table Expressions), which are covered in an advanced guide. Recursive CTEs can traverse a hierarchy of any depth without hardcoding the number of joins.
-- Preview: Recursive CTE for unlimited hierarchy depth
WITH RECURSIVE org_chart AS (
-- Base case: start with the CEO
SELECT id, name, role, manager_id, 0 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: find each employee's reports
SELECT e.id, e.name, e.role, e.manager_id, oc.level + 1
FROM employees e
INNER JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart ORDER BY level, name;
Self Join Use Cases Beyond Employee Hierarchies
Finding Employees with the Same Role
Compare employees who share a role to analyze salary ranges:
SELECT e1.name AS employee_a,
e2.name AS employee_b,
e1.role,
e1.salary AS salary_a,
e2.salary AS salary_b,
ABS(e1.salary - e2.salary) AS salary_difference
FROM employees e1
INNER JOIN employees e2 ON e1.role = e2.role AND e1.id < e2.id
ORDER BY e1.role, salary_difference DESC;
Output:
| employee_a | employee_b | role | salary_a | salary_b | salary_difference |
|---|---|---|---|---|---|
| Tom Wilson | Nina Santos | Junior Developer | 75000.00 | 72000.00 | 3000.00 |
| James Cooper | Aisha Patel | Senior Developer | 120000.00 | 125000.00 | 5000.00 |
The e1.id < e2.id condition prevents duplicate pairs (showing "Tom vs Nina" but not also "Nina vs Tom") and self-pairs ("Tom vs Tom").
When comparing rows within the same table, the condition e1.id < e2.id (or e1.id != e2.id for non-directional comparisons) is essential.
-- Without the id check: Self-pairs and duplicates
FROM employees e1
INNER JOIN employees e2 ON e1.role = e2.role
-- Tom vs Tom (self-pair, meaningless)
-- Tom vs Nina AND Nina vs Tom (duplicate pair)
-- With e1.id < e2.id: Clean pairs only
FROM employees e1
INNER JOIN employees e2 ON e1.role = e2.role AND e1.id < e2.id
-- Only Tom vs Nina (one entry per unique pair)
Finding Employees in the Same Department
SELECT e1.name AS employee_a,
e2.name AS employee_b,
e1.department
FROM employees e1
INNER JOIN employees e2
ON e1.department = e2.department
AND e1.id < e2.id
WHERE e1.department = 'Engineering'
ORDER BY e1.name, e2.name;
Output:
| employee_a | employee_b | department |
|---|---|---|
| Aisha Patel | Nina Santos | Engineering |
| Aisha Patel | Tom Wilson | Engineering |
| James Cooper | Aisha Patel | Engineering |
| James Cooper | Nina Santos | Engineering |
| James Cooper | Tom Wilson | Engineering |
| Marcus Rivera | Aisha Patel | Engineering |
| Marcus Rivera | James Cooper | Engineering |
| Marcus Rivera | Nina Santos | Engineering |
| Marcus Rivera | Tom Wilson | Engineering |
| Aisha Patel | Tom Wilson | Engineering |
Every possible pair of Engineering employees appears exactly once.
Products in the Same Category and Price Range
Self joins work on any table, not just employee hierarchies. Let us use the ShopSmart products table:
SELECT p1.name AS product_a,
p2.name AS product_b,
cat.name AS category,
p1.price AS price_a,
p2.price AS price_b,
ABS(p1.price - p2.price) AS price_gap
FROM products p1
INNER JOIN products p2
ON p1.category_id = p2.category_id
AND p1.id < p2.id
INNER JOIN categories cat ON p1.category_id = cat.id
WHERE ABS(p1.price - p2.price) < 20
ORDER BY price_gap ASC;
Output:
| product_a | product_b | category | price_a | price_b | price_gap |
|---|---|---|---|---|---|
| SQL for Beginners | Data Science Handbook | Books | 34.99 | 42.50 | 7.509999999999998 |
| Yoga Mat Premium | Stainless Water Bottle | Sports | 38.00 | 24.99 | 13.010000000000002 |
| Wireless Mouse | USB-C Hub | Electronics | 29.99 | 45.00 | 15.010000000000002 |
These are product pairs in the same category with a price difference under $20, exactly the kind of data a "similar products" recommendation engine would use.
Finding Sequential Orders by the Same Customer
SELECT o1.id AS earlier_order,
o2.id AS later_order,
c.first_name || ' ' || c.last_name AS customer,
o1.order_date AS first_date,
o2.order_date AS second_date,
o2.order_date - o1.order_date AS days_between
FROM orders o1
INNER JOIN orders o2
ON o1.customer_id = o2.customer_id
AND o1.order_date < o2.order_date
INNER JOIN customers c ON o1.customer_id = c.id
ORDER BY customer, o1.order_date;
Output:
| earlier_order | later_order | customer | first_date | second_date | days_between |
|---|---|---|---|---|---|
| 1 | 3 | Alice Johnson | 2024-01-10 | 2024-02-20 | 41 |
| 1 | 9 | Alice Johnson | 2024-01-10 | 2024-04-15 | 96 |
| 3 | 9 | Alice Johnson | 2024-02-20 | 2024-04-15 | 55 |
| 2 | 7 | Bob Martinez | 2024-01-15 | 2024-04-01 | 77 |
| 2 | 11 | Bob Martinez | 2024-01-15 | 2024-04-22 | 98 |
| 7 | 11 | Bob Martinez | 2024-04-01 | 2024-04-22 | 21 |
| 4 | 10 | Carol Singh | 2024-03-05 | 2024-04-18 | 44 |
| 5 | 12 | David Chen | 2024-03-12 | 2024-04-25 | 44 |
This shows every pair of orders by the same customer and how many days elapsed between them, useful for understanding customer purchase frequency.
The o2.order_date - o1.order_date subtraction works in PostgreSQL, returning an integer number of days. In other databases:
-- MySQL
DATEDIFF(o2.order_date, o1.order_date)
-- SQL Server
DATEDIFF(day, o1.order_date, o2.order_date)
-- SQLite
JULIANDAY(o2.order_date) - JULIANDAY(o1.order_date)
Self Join with Aggregation
Self joins combine naturally with GROUP BY and aggregate functions for analytical queries.
Average Salary Compared to Manager's Salary
SELECT e.name AS employee,
e.role,
e.salary AS employee_salary,
m.salary AS manager_salary,
e.salary - m.salary AS salary_difference,
ROUND(e.salary * 100.0 / m.salary, 1) AS pct_of_manager
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id
ORDER BY pct_of_manager DESC;
Output:
| employee | role | employee_salary | manager_salary | salary_difference | pct_of_manager |
|---|---|---|---|---|---|
| David Kim | SEO Specialist | 70000 | 95000 | -25000 | 73.7 |
| Marcus Rivera | VP Engineering | 180000 | 250000 | -70000 | 72.0 |
| Linda Park | VP Marketing | 175000 | 250000 | -75000 | 70.0 |
| Aisha Patel | Senior Developer | 125000 | 180000 | -55000 | 69.4 |
| Rachel Adams | Content Writer | 65000 | 95000 | -30000 | 68.4 |
| James Cooper | Senior Developer | 120000 | 180000 | -60000 | 66.7 |
| Tom Wilson | Junior Developer | 75000 | 120000 | -45000 | 62.5 |
| Nina Santos | Junior Developer | 72000 | 120000 | -48000 | 60.0 |
| Kevin Zhang | Marketing Manager | 95000 | 175000 | -80000 | 54.3 |
Team Cost Analysis
SELECT m.name AS manager,
m.role,
m.salary AS manager_salary,
COUNT(e.id) AS team_size,
SUM(e.salary) AS team_payroll,
SUM(e.salary) + m.salary AS total_cost,
ROUND(AVG(e.salary), 2) AS avg_team_salary
FROM employees m
INNER JOIN employees e ON m.id = e.manager_id
GROUP BY m.id, m.name, m.role, m.salary
ORDER BY total_cost DESC;
Output:
| employee | role | manager_salary | team_size | team_payroll | total_cost | avg_team_salary |
|---|---|---|---|---|---|---|
| Sarah Chen | CEO | 250000.00 | 2 | 355000.00 | 605000.00 | 177500.00 |
| Marcus Rivera | VP Engineering | 180000.00 | 2 | 245000.00 | 425000.00 | 122500.00 |
| Linda Park | VP Marketing | 175000.00 | 1 | 95000.00 | 270000.00 | 95000.00 |
| James Cooper | Senior Developer | 120000.00 | 2 | 147000.00 | 267000.00 | 73500.00 |
| Kevin Zhang | Marketing Manager | 95000.00 | 2 | 135000.00 | 230000.00 | 67500.00 |
Employees Who Earn More Than Their Manager
A classic self join question:
SELECT e.name AS employee,
e.role AS employee_role,
e.salary AS employee_salary,
m.name AS manager,
m.salary AS manager_salary
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
Output:
(empty result set)
No one earns more than their manager in our sample data, which is realistic. But if we adjust the scenario:
-- Temporarily give Aisha a raise above her VP
UPDATE employees SET salary = 185000 WHERE name = 'Aisha Patel';
SELECT e.name AS employee,
e.salary AS employee_salary,
m.name AS manager,
m.salary AS manager_salary,
e.salary - m.salary AS overpayment
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
Output:
| employee | employee_salary | manager | manager_salary | overpayment |
|---|---|---|---|---|
| Aisha Patel | 185000.00 | Marcus Rivera | 180000.00 | 5000.00 |
-- Reset Aisha's salary
UPDATE employees SET salary = 125000 WHERE name = 'Aisha Patel';
This is a common audit query in HR departments.
Employees Without Direct Reports
Using a LEFT JOIN self join to find leaf nodes (employees who manage nobody):
SELECT e.name AS employee,
e.role,
e.department
FROM employees e
LEFT JOIN employees report ON e.id = report.manager_id
WHERE report.id IS NULL
ORDER BY e.department, e.name;
Output:
| employee | role | department |
|---|---|---|
| Aisha Patel | Senior Developer | Engineering |
| Nina Santos | Junior Developer | Engineering |
| Tom Wilson | Junior Developer | Engineering |
| David Kim | SEO Specialist | Marketing |
| Rachel Adams | Content Writer | Marketing |
These are the "individual contributors" who do not manage anyone. The LEFT JOIN looks for employees who reference this person as their manager. When none are found (report.id IS NULL), the employee is a leaf node.
Self Join on the ShopSmart Products Table
Self joins are not limited to hierarchical data. Let us explore some non-hierarchical self join patterns using the products table.
Finding More Expensive Alternatives in the Same Category
SELECT p1.name AS current_product,
p1.price AS current_price,
p2.name AS upgrade_option,
p2.price AS upgrade_price,
p2.price - p1.price AS price_increase
FROM products p1
INNER JOIN products p2
ON p1.category_id = p2.category_id
AND p2.price > p1.price
WHERE p1.is_available = true
AND p2.is_available = true
ORDER BY p1.name, price_increase ASC;
Output:
| current_product | current_price | upgrade_option | upgrade_price | price_increase |
|---|---|---|---|---|
| SQL for Beginners | 34.99 | Data Science Handbook | 42.50 | 7.509999999999998 |
| Stainless Water Bottle | 24.99 | Yoga Mat Premium | 38.00 | 13.010000000000002 |
| Stainless Water Bottle | 24.99 | Running Shoes X1 | 110.00 | 85.01 |
| USB-C Hub | 45.00 | Mechanical Keyboard | 89.99 | 44.989999999999995 |
| Wireless Mouse | 29.99 | USB-C Hub | 45.00 | 15.010000000000002 |
| Wireless Mouse | 29.99 | Mechanical Keyboard | 89.99 | 60.00 |
| Yoga Mat Premium | 38.00 | Running Shoes X1 | 110.00 | 72.00 |
This query powers an "upgrade your purchase" recommendation feature.
Finding the Next Most Expensive Product in Any Category
SELECT p1.name AS product,
p1.price,
p2.name AS next_cheapest_above,
p2.price AS next_price
FROM products p1
INNER JOIN products p2
ON p2.price > p1.price
WHERE p2.price = (
SELECT MIN(p3.price)
FROM products p3
WHERE p3.price > p1.price
)
ORDER BY p1.price;
Output:
| product | price | next_cheapest_above | next_price |
|---|---|---|---|
| Stainless Water Bottle | 24.99 | Wireless Mouse | 29.99 |
| Wireless Mouse | 29.99 | SQL for Beginners | 34.99 |
| SQL for Beginners | 34.99 | Yoga Mat Premium | 38.00 |
| Yoga Mat Premium | 38.00 | Data Science Handbook | 42.50 |
| Data Science Handbook | 42.50 | USB-C Hub | 45.00 |
| USB-C Hub | 45.00 | Bluetooth Speaker | 65.00 |
| Bluetooth Speaker | 65.00 | Mechanical Keyboard | 89.99 |
| Mechanical Keyboard | 89.99 | Running Shoes X1 | 110.00 |
| Running Shoes X1 | 110.00 | Coffee Maker Pro | 129.99 |
Each product is paired with the product that costs just slightly more.
Common Self Join Mistakes
Mistake 1: Forgetting the Self-Pair Prevention
-- Wrong: Includes self-pairs (Tom Wilson compared with Tom Wilson)
SELECT e1.name, e2.name
FROM employees e1
INNER JOIN employees e2 ON e1.department = e2.department;
-- Includes: Tom Wilson | Tom Wilson (meaningless)
-- Correct: Exclude self-pairs
SELECT e1.name, e2.name
FROM employees e1
INNER JOIN employees e2 ON e1.department = e2.department
WHERE e1.id != e2.id;
-- Or use e1.id < e2.id to also prevent reverse duplicates
Mistake 2: Using the Same Alias for Both Instances
-- Wrong: Cannot use the same alias twice
SELECT e.name, e.name
FROM employees e
INNER JOIN employees e ON e.manager_id = e.id;
-- ERROR: table name "e" specified more than once
-- Correct: Each instance needs a unique alias
SELECT e.name, m.name
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id;
Mistake 3: Joining on the Wrong Columns
-- Wrong: Joining employee id to employee id (every row matches itself)
SELECT e.name, m.name
FROM employees e
INNER JOIN employees m ON e.id = m.id;
-- This just returns each employee paired with themselves
-- Correct: Join the foreign key to the primary key
SELECT e.name, m.name
FROM employees e
INNER JOIN employees m ON e.manager_id = m.id;
Self joins are conceptually tricky because the same table plays two different roles. Before running a self join, ask yourself:
- Which alias represents the primary entity I am interested in?
- Which alias represents the related entity I am looking up?
- Which column in the primary entity references a row via the related entity's primary key?
- Do I need to prevent self-pairs (
e1.id != e2.id)? - Do I need to prevent duplicate pairs (
e1.id < e2.id)?
Practical Exercises
Exercise 1
Write a query that shows each employee with their manager's name and department.
SELECT e.name AS employee,
e.department AS employee_dept,
COALESCE(m.name, 'None') AS manager,
COALESCE(m.department, 'N/A') AS manager_dept
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY e.department, e.name;
| employee | employee_dept | manager | manager_dept |
|---|---|---|---|
| Aisha Patel | Engineering | Marcus Rivera | Engineering |
| James Cooper | Engineering | Marcus Rivera | Engineering |
| Marcus Rivera | Engineering | Sarah Chen | Executive |
| Nina Santos | Engineering | James Cooper | Engineering |
| Tom Wilson | Engineering | James Cooper | Engineering |
| Sarah Chen | Executive | None | N/A |
| David Kim | Marketing | Kevin Zhang | Marketing |
| Kevin Zhang | Marketing | Linda Park | Marketing |
| Linda Park | Marketing | Sarah Chen | Executive |
| Rachel Adams | Marketing | Kevin Zhang | Marketing |
Exercise 2
Find all pairs of employees in the same department who have a salary difference of less than $10,000.
SELECT e1.name AS employee_a,
e2.name AS employee_b,
e1.department,
e1.salary AS salary_a,
e2.salary AS salary_b,
ABS(e1.salary - e2.salary) AS salary_gap
FROM employees e1
INNER JOIN employees e2
ON e1.department = e2.department
AND e1.id < e2.id
WHERE ABS(e1.salary - e2.salary) < 10000
ORDER BY salary_gap;
Expected output:
| employee_a | employee_b | department | salary_a | salary_b | salary_gap |
|---|---|---|---|---|---|
| Tom Wilson | Nina Santos | Engineering | 75000.00 | 72000.00 | 3000.00 |
| James Cooper | Aisha Patel | Engineering | 180000.00 | 185000.00 | 5000.00 |
| Rachel Adams | David Kim | Marketing | 65000.00 | 70000.00 | 5000.00 |
Exercise 3
Find employees who are at the bottom of the hierarchy (manage nobody) and show how many levels above them the CEO is.
SELECT e.name AS employee,
e.role,
m1.name AS manager,
m2.name AS skip_level,
m3.name AS top_level
FROM employees e
LEFT JOIN employees report ON e.id = report.manager_id
LEFT JOIN employees m1 ON e.manager_id = m1.id
LEFT JOIN employees m2 ON m1.manager_id = m2.id
LEFT JOIN employees m3 ON m2.manager_id = m3.id
WHERE report.id IS NULL
ORDER BY e.name;
| employee | role | manager | skip_level | top_level |
|---|---|---|---|---|
| David Kim | SEO Specialist | Kevin Zhang | Linda Park | Sarah Chen |
| Nina Santos | Junior Developer | James Cooper | Marcus Rivera | Sarah Chen |
| Rachel Adams | Content Writer | Kevin Zhang | Linda Park | Sarah Chen |
| Tom Wilson | Junior Developer | James Cooper | Marcus Rivera | Sarah Chen |
| Aisha Patel | Senior Developer | Marcus Rivera | Sarah Chen | null |
Exercise 4
Using the ShopSmart products table, find all pairs of products in the same category where both products are available and both have been reviewed.
SELECT p1.name AS product_a,
p2.name AS product_b,
cat.name AS category
FROM products p1
INNER JOIN products p2
ON p1.category_id = p2.category_id
AND p1.id < p2.id
INNER JOIN categories cat ON p1.category_id = cat.id
WHERE p1.is_available = true
AND p2.is_available = true
AND EXISTS (SELECT 1 FROM reviews WHERE product_id = p1.id)
AND EXISTS (SELECT 1 FROM reviews WHERE product_id = p2.id)
ORDER BY cat.name, p1.name;
| product_a | product_b | category |
|---|---|---|
| SQL for Beginners | Data Science Handbook | Books |
| Mechanical Keyboard | USB-C Hub | Electronics |
| Wireless Mouse | Mechanical Keyboard | Electronics |
| Wireless Mouse | USB-C Hub | Electronics |
| Running Shoes X1 | Stainless Water Bottle | Sports |
| Yoga Mat Premium | Running Shoes X1 | Sports |
| Yoga Mat Premium | Stainless Water Bottle | Sports |
Exercise 5
Create an organization chart query that shows each manager with a comma-separated list of their direct reports' names.
-- PostgreSQL syntax
SELECT m.name AS manager,
m.role,
STRING_AGG(e.name, ', ' ORDER BY e.name) AS direct_reports
FROM employees m
INNER JOIN employees e ON m.id = e.manager_id
GROUP BY m.id, m.name, m.role
ORDER BY m.name;
Expected output:
| manager | role | direct_reports |
|---|---|---|
| James Cooper | Senior Developer | Nina Santos, Tom Wilson |
| Kevin Zhang | Marketing Manager | David Kim, Rachel Adams |
| Linda Park | VP Marketing | Kevin Zhang |
| Marcus Rivera | VP Engineering | Aisha Patel, James Cooper |
| Sarah Chen | CEO | Linda Park, Marcus Rivera |
Cleanup
If you want to keep the employees table for future practice, leave it. Otherwise:
-- Remove the employees table if no longer needed
-- DROP TABLE IF EXISTS employees;
Key Takeaways
Self joins are a natural extension of regular joins, applied when the data you need to connect lives within a single table. Here is what you should remember:
- A self join joins a table to itself, using two different aliases to treat it as two separate tables
- Self joins use the same
INNER JOIN,LEFT JOIN, and other join syntax you already know. There is no special "self join" keyword - Aliases are mandatory in self joins. Without two distinct aliases, the database cannot distinguish between the two instances of the table
- The most common self join use case is hierarchical data: employees and managers, categories and subcategories, comments and replies
- Use
INNER JOINto show only rows with a match (employees who have a manager) - Use
LEFT JOINto include all rows (including the CEO who has no manager, or leaf employees who manage nobody) - For comparison queries (finding peers, similar products), use
e1.id < e2.idto prevent self-pairs and duplicate pairs - Self joins can be chained to traverse multiple hierarchy levels, but each level requires another join
- For unlimited hierarchy depth, recursive CTEs are more appropriate than chained self joins
- Self joins work on any self-referencing relationship, not just employee hierarchies: product variants, comment threads, referral chains, flight connections, and more
- Always verify your join condition and alias roles before running a self join. The same table playing two roles can be confusing if the logic is not clear
Self joins complete your understanding of join mechanics. You now know how to combine data across different tables (INNER JOIN, LEFT JOIN, FULL OUTER JOIN, CROSS JOIN) and within the same table (self joins). This comprehensive join knowledge equips you to query virtually any relational database structure you will encounter.