How to Read Execution Plans with SQL EXPLAIN and SQL EXPLAIN ANALYZE
You have written a query. It returns the correct results. But it takes 45 seconds to run on production data, and you have no idea why. Is it scanning millions of rows unnecessarily? Is it ignoring your carefully created index? Is the join strategy wrong? Without looking at the execution plan, you are guessing in the dark.
SQL EXPLAIN is the tool that lets you see exactly what the database is doing under the hood. It reveals the step-by-step strategy the query optimizer chose: which tables are scanned first, which indexes are used (or ignored), how tables are joined, where sorting happens, and where time is actually spent. Learning to read execution plans is arguably the single most valuable skill for SQL performance tuning.
This guide teaches you how to generate execution plans, how to read the most common node types (sequential scans, index scans, joins, sorts), and how to identify bottlenecks that are making your queries slow. Every concept is demonstrated with practical examples and annotated outputs so you can start applying this immediately.
Why Execution Plans Matter
When you submit a SQL query, the database does not just execute it blindly. The query optimizer evaluates potentially thousands of different strategies for retrieving your data and picks the one it estimates will be fastest. The execution plan is the optimizer's chosen strategy.
Without EXPLAIN, performance tuning is guesswork:
- "Maybe I need an index?" (Which column? Is the current one being used?)
- "Maybe the join is slow?" (Which join? What strategy is it using?)
- "Maybe the table is too big?" (Is it scanning the whole thing or just a portion?)
With EXPLAIN, you get concrete answers to all of these questions.
The Sample Data
We will use these tables throughout the guide. Imagine them at production scale with hundreds of thousands to millions of rows.
customers table (~100,000 rows):
| id | name | email | country | signup_date |
|---|---|---|---|---|
| 1 | Alice | alice@example.com | US | 2022-03-15 |
| 2 | Bob | bob@mail.co | UK | 2022-06-01 |
| ... | ... | ... | ... | ... |
orders table (~2,000,000 rows):
| id | customer_id | order_date | amount | status |
|---|---|---|---|---|
| 1 | 1 | 2024-01-05 | 250.00 | completed |
| 2 | 2 | 2024-01-12 | 430.00 | completed |
| ... | ... | ... | ... | ... |
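If you want to follow along, the tables could be created with something like the following (a PostgreSQL-flavored sketch; the column types are assumptions, since the guide only describes the data informally):

```sql
-- Hypothetical schema matching the sample data above
CREATE TABLE customers (
    id          SERIAL PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL,
    country     TEXT NOT NULL,
    signup_date DATE NOT NULL
);

CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    order_date  DATE NOT NULL,
    amount      NUMERIC(10, 2) NOT NULL,
    status      TEXT NOT NULL
);
```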
Existing indexes:
-- Primary keys (automatic)
-- customers(id), orders(id)
-- Manually created
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_customers_country ON customers(country);
CREATE INDEX idx_customers_email ON customers(email);
How to Generate an Execution Plan
EXPLAIN (Estimated Plan)
EXPLAIN shows the optimizer's planned strategy without actually running the query. It is fast and safe to use on any query, even expensive ones.
-- PostgreSQL / MySQL (SQLite's plain EXPLAIN dumps bytecode; use EXPLAIN QUERY PLAN for a readable plan)
EXPLAIN
SELECT * FROM orders WHERE customer_id = 42;
-- SQL Server
SET SHOWPLAN_TEXT ON;
GO
SELECT * FROM orders WHERE customer_id = 42;
GO
SET SHOWPLAN_TEXT OFF;
-- Oracle
EXPLAIN PLAN FOR
SELECT * FROM orders WHERE customer_id = 42;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
EXPLAIN ANALYZE (Actual Execution)
EXPLAIN ANALYZE actually runs the query and shows both the estimated plan and the real execution statistics (actual time, actual row counts). This is the most informative version.
-- PostgreSQL
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
-- MySQL
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
EXPLAIN ANALYZE executes the query. For SELECT statements this is harmless (it just discards the results). But be extremely careful with INSERT, UPDATE, or DELETE:
-- DANGEROUS: This will actually delete rows!
EXPLAIN ANALYZE DELETE FROM orders WHERE status = 'cancelled';
To safely analyze a write query, wrap it in a transaction and roll back:
-- Safe approach
BEGIN;
EXPLAIN ANALYZE DELETE FROM orders WHERE status = 'cancelled';
ROLLBACK; -- Undo the delete
PostgreSQL Output Formats
PostgreSQL supports several output formats for EXPLAIN. The text format is the default and what we will use throughout this guide, but other formats are available:
-- Default text format
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- JSON format (great for programmatic parsing)
EXPLAIN (FORMAT JSON) SELECT * FROM orders WHERE customer_id = 42;
-- YAML format
EXPLAIN (FORMAT YAML) SELECT * FROM orders WHERE customer_id = 42;
-- Include all available information
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE customer_id = 42;
The BUFFERS option (PostgreSQL) shows how many disk pages were read, which is invaluable for understanding I/O-heavy queries.
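With `BUFFERS`, each plan node gains a buffer-usage line. The numbers below are illustrative, not from a real run:

```
Index Scan using idx_orders_customer_id on orders (...)
  Index Cond: (customer_id = 42)
  Buffers: shared hit=4 read=2
```

`shared hit` counts pages found in PostgreSQL's buffer cache; `read` counts pages that had to be fetched from disk (or the OS cache). A query that is slow only on its first run and fast afterwards typically shows its `read` count collapsing to zero.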
Reading Your First Execution Plan
Let's start with a simple query and break down every piece of the plan.
Simple Query with Index
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
Output (PostgreSQL):
Index Scan using idx_orders_customer_id on orders (cost=0.43..28.12 rows=15 width=36)
Index Cond: (customer_id = 42)
Planning Time: 0.085 ms
Execution Time: 0.142 ms
Let's decode each part:
| Component | Meaning |
|---|---|
| `Index Scan` | The operation type: the database is using an index to find rows |
| `using idx_orders_customer_id` | Which index is being used |
| `on orders` | Which table is being accessed |
| `cost=0.43..28.12` | Estimated cost: startup cost (0.43) and total cost (28.12) in arbitrary units |
| `rows=15` | Estimated row count: the optimizer expects to find about 15 matching rows |
| `width=36` | Estimated row width: each row is approximately 36 bytes |
| `Index Cond: (customer_id = 42)` | The condition used to search the index |
| `Planning Time: 0.085 ms` | Time spent planning (choosing the strategy) |
| `Execution Time: 0.142 ms` | Time spent executing (actually finding and returning rows) |
When EXPLAIN ANALYZE is used, you also see actual values:
Index Scan using idx_orders_customer_id on orders
(cost=0.43..28.12 rows=15 width=36)
(actual time=0.025..0.098 rows=12 loops=1)
Index Cond: (customer_id = 42)
Planning Time: 0.085 ms
Execution Time: 0.142 ms
| Component | Meaning |
|---|---|
| `actual time=0.025..0.098` | Real time in milliseconds: first row at 0.025 ms, last row at 0.098 ms |
| `rows=12` | Actual rows returned (vs. 15 estimated) |
| `loops=1` | This node executed once |
Comparing rows=15 (estimated) with rows=12 (actual) tells you how accurate the optimizer's statistics are. Large discrepancies (e.g., estimated 100, actual 50,000) indicate stale statistics and can cause the optimizer to choose a bad plan. Run ANALYZE (PostgreSQL) or UPDATE STATISTICS (SQL Server) to refresh them:
-- PostgreSQL: Update statistics for a table
ANALYZE orders;
-- SQL Server
UPDATE STATISTICS orders;
-- MySQL
ANALYZE TABLE orders;
Sequential Scan vs Index Scan
The two most fundamental scan types are the sequential scan (full table scan) and the index scan. Understanding when each is used and why is the core of execution plan reading.
Sequential Scan (Seq Scan)
A sequential scan reads every row in the table from beginning to end, checking each one against the WHERE condition. It is the database equivalent of reading a book page by page.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE amount > 100;
Output:
Seq Scan on orders (cost=0.00..45218.00 rows=1850000 width=36)
(actual time=0.015..312.450 rows=1823456 loops=1)
Filter: (amount > 100)
Rows Removed by Filter: 176544
Planning Time: 0.065 ms
Execution Time: 458.230 ms
Key observations:
- `Seq Scan`: Full table scan. Every row is examined.
- `Filter: (amount > 100)`: The condition is applied after reading each row. This is different from `Index Cond`, where the condition is used to navigate the index.
- `Rows Removed by Filter: 176544`: About 176K rows were read but did not match the condition. This is wasted work.
- `rows=1823456`: The query matches ~1.8 million rows out of ~2 million. This is why the optimizer chose a sequential scan: when you are returning most of the table, scanning it sequentially is faster than bouncing around through an index.
Index Scan
An index scan uses a B-tree index to navigate directly to matching rows, then fetches the full row data from the table.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
Output:
Index Scan using idx_orders_customer_id on orders (cost=0.43..28.12 rows=15 width=36)
(actual time=0.025..0.098 rows=12 loops=1)
Index Cond: (customer_id = 42)
Planning Time: 0.085 ms
Execution Time: 0.142 ms
- `Index Scan`: The database navigated the B-tree to find matching rows.
- `Index Cond`: The condition was used to search the index, not just filter after reading.
- 0.142 ms vs 458 ms for the sequential scan. The difference is enormous.
Index-Only Scan
An index-only scan is the fastest scan type. The index contains all the columns the query needs, so the database never visits the table at all.
EXPLAIN ANALYZE
SELECT customer_id, order_date FROM orders WHERE customer_id = 42;
If a covering index exists on (customer_id, order_date):
Index Only Scan using idx_orders_cust_date on orders (cost=0.43..18.50 rows=15 width=12)
(actual time=0.018..0.055 rows=12 loops=1)
Index Cond: (customer_id = 42)
Heap Fetches: 0
Planning Time: 0.072 ms
Execution Time: 0.089 ms
- `Index Only Scan`: Everything is read from the index. No table access.
- `Heap Fetches: 0`: Confirms zero visits to the actual table. If this number is high, run `VACUUM` (PostgreSQL) to update the visibility map.
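The covering index assumed in this example (the name `idx_orders_cust_date` comes from the plan output above and is hypothetical) could be created like so:

```sql
-- Composite index covering both columns the query needs
CREATE INDEX idx_orders_cust_date ON orders(customer_id, order_date);

-- PostgreSQL 11+ alternative: keep order_date out of the key but in the index
CREATE INDEX idx_orders_cust_incl ON orders(customer_id) INCLUDE (order_date);
```

The `INCLUDE` form stores `order_date` in the index leaf pages without making it part of the search key, which keeps the key smaller while still enabling index-only scans.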
Bitmap Index Scan
A bitmap index scan is a hybrid approach. The database first scans the index to build a "bitmap" of which table pages contain matching rows, then reads those pages. This is efficient when the index matches many rows spread across many pages.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31';
Output:
Bitmap Heap Scan on orders (cost=1250.00..35420.00 rows=165000 width=36)
(actual time=12.500..185.300 rows=162340 loops=1)
Recheck Cond: (order_date >= '2024-01-01' AND order_date <= '2024-01-31')
Heap Blocks: exact=28450
-> Bitmap Index Scan on idx_orders_date (cost=0.00..1208.75 rows=165000 width=0)
(actual time=10.200..10.200 rows=162340 loops=1)
Index Cond: (order_date >= '2024-01-01' AND order_date <= '2024-01-31')
Planning Time: 0.095 ms
Execution Time: 215.800 ms
This plan has two nodes:
1. `Bitmap Index Scan` (inner/child): Scans the index and builds a bitmap of matching page locations.
2. `Bitmap Heap Scan` (outer/parent): Reads the table pages identified by the bitmap.
The advantage over a regular index scan is reduced random I/O: instead of bouncing to each matching row individually, the bitmap groups matches by page and reads each page once.
Scan Type Summary
| Scan Type | When Used | Speed | Table Access |
|---|---|---|---|
| Seq Scan | No useful index, or most rows match | Slow for selective queries, fine for bulk reads | Reads every row |
| Index Scan | Index exists, few rows match | Fast | Index lookup + table row fetch |
| Index Only Scan | Index covers all needed columns | Fastest | Index only, no table access |
| Bitmap Index Scan | Index exists, moderate number of rows match | Medium | Index bitmap + batched table reads |
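To compare scan types on the same query, PostgreSQL lets you temporarily discourage a scan type in your session. This is a diagnostic sketch only, not something to set in production; the flag does not hard-disable sequential scans, it just makes the planner cost them prohibitively high:

```sql
SET enable_seqscan = off;   -- planner now strongly avoids sequential scans
EXPLAIN ANALYZE SELECT * FROM orders WHERE amount > 100;

SET enable_seqscan = on;    -- restore the default
```

If the forced plan is slower than the original, the optimizer's choice of a sequential scan was correct for this query.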
Reading Multi-Node Plans
Real queries produce plans with multiple nodes arranged in a tree structure. Data flows from the innermost (bottom) nodes up to the outermost (top) node. The top node produces the final result.
Join Plans
EXPLAIN ANALYZE
SELECT c.name, o.order_date, o.amount
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'US';
Output:
Hash Join (cost=2850.00..52340.00 rows=420000 width=28)
(actual time=25.300..385.600 rows=415230 loops=1)
Hash Cond: (o.customer_id = c.id)
-> Seq Scan on orders o (cost=0.00..35218.00 rows=2000000 width=16)
(actual time=0.010..120.500 rows=2000000 loops=1)
-> Hash (cost=2450.00..2450.00 rows=32000 width=20)
(actual time=24.800..24.800 rows=31500 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 1850kB
-> Seq Scan on customers c (cost=0.00..2450.00 rows=32000 width=20)
(actual time=0.012..15.300 rows=31500 loops=1)
Filter: (country = 'US')
Rows Removed by Filter: 68500
Planning Time: 0.250 ms
Execution Time: 425.800 ms
Read this plan bottom-up:
1. Seq Scan on customers c (innermost, bottom): Scans the entire `customers` table, filters for `country = 'US'`, finds 31,500 matching rows.
2. Hash: Builds a hash table in memory from those 31,500 customer rows (using 1,850 kB of memory).
3. Seq Scan on orders o: Scans the entire `orders` table (2 million rows).
4. Hash Join (top): For each order row, probes the hash table to find a matching customer. Produces 415,230 result rows.
Common Join Strategies
| Join Type | How It Works | Best When |
|---|---|---|
| Nested Loop | For each row in the outer table, scan the inner table (often via index) | Small outer table, indexed inner table |
| Hash Join | Build a hash table from one table, probe it with the other | Medium-to-large tables, no useful index for the join |
| Merge Join | Sort both tables on the join key, then merge them | Both tables already sorted (or indexed) on the join key |
-- Nested Loop example: small result from customers, index on orders
EXPLAIN ANALYZE
SELECT c.name, o.amount
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.email = 'alice@example.com';
Output:
Nested Loop (cost=0.86..52.30 rows=20 width=18)
(actual time=0.035..0.180 rows=18 loops=1)
-> Index Scan using idx_customers_email on customers c (cost=0.42..8.44 rows=1 width=10)
(actual time=0.020..0.022 rows=1 loops=1)
Index Cond: (email = 'alice@example.com')
-> Index Scan using idx_orders_customer_id on orders o (cost=0.43..43.50 rows=20 width=12)
(actual time=0.012..0.145 rows=18 loops=1)
Index Cond: (customer_id = c.id)
Planning Time: 0.180 ms
Execution Time: 0.225 ms
This is an efficient nested loop: the outer scan finds 1 customer via index, then the inner scan finds that customer's 18 orders via index. Total time: 0.225 ms.
Sort and Aggregate Nodes
EXPLAIN ANALYZE
SELECT customer_id, COUNT(*), SUM(amount)
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY customer_id
ORDER BY SUM(amount) DESC
LIMIT 10;
Output:
Limit (cost=48500.00..48500.02 rows=10 width=20)
(actual time=310.500..310.510 rows=10 loops=1)
-> Sort (cost=48500.00..48750.00 rows=100000 width=20)
(actual time=310.498..310.505 rows=10 loops=1)
Sort Key: (sum(amount)) DESC
Sort Method: top-N heapsort Memory: 25kB
-> HashAggregate (cost=45200.00..46200.00 rows=100000 width=20)
(actual time=280.300..298.500 rows=98500 loops=1)
Group Key: customer_id
Batches: 1 Memory Usage: 14337kB
-> Seq Scan on orders (cost=0.00..40218.00 rows=1650000 width=12)
(actual time=0.012..95.800 rows=1648200 loops=1)
Filter: (order_date >= '2024-01-01')
Rows Removed by Filter: 351800
Planning Time: 0.125 ms
Execution Time: 310.650 ms
Reading bottom-up:
1. Seq Scan on orders: Reads all rows, filters by date, produces 1.65M matching rows.
2. HashAggregate: Groups by `customer_id`, computes `COUNT(*)` and `SUM(amount)`, producing 98,500 groups. Uses 14 MB of memory.
3. Sort: Sorts the 98,500 groups by `SUM(amount) DESC`. Uses a `top-N heapsort` because it only needs the top 10.
4. Limit: Takes only the first 10 rows from the sorted output.
Identifying Bottlenecks
Now that you can read plans, let's focus on finding and fixing the slow parts.
Bottleneck 1: Unexpected Sequential Scan
The most common performance problem is a sequential scan where an index scan should be used.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
Bad plan (what you would see if the index were missing or unusable):
Seq Scan on orders (cost=0.00..45218.00 rows=15 width=36)
(actual time=0.015..285.300 rows=12 loops=1)
Filter: (customer_id = 42)
Rows Removed by Filter: 1999988
The database scanned 2 million rows to find 12. This is a massive waste.
Common causes and fixes:
| Cause | Fix |
|---|---|
| No index exists on customer_id | CREATE INDEX idx_orders_customer_id ON orders(customer_id) |
| Statistics are stale | ANALYZE orders; (PostgreSQL) |
| Column wrapped in function | Rewrite query or create expression index |
| Implicit type cast | Ensure matching data types in the comparison |
After adding the index:
Index Scan using idx_orders_customer_id on orders (cost=0.43..28.12 rows=15 width=36)
(actual time=0.025..0.098 rows=12 loops=1)
Index Cond: (customer_id = 42)
From 285ms to 0.098ms. That is a 2,900x improvement.
Bottleneck 2: High Rows Removed by Filter
When you see a large Rows Removed by Filter count, it means the database read many rows and then threw most of them away.
Seq Scan on orders (cost=0.00..45218.00 rows=50000 width=36)
(actual time=0.010..295.400 rows=48500 loops=1)
Filter: (status = 'pending' AND amount > 200)
Rows Removed by Filter: 1951500
Nearly 2 million rows were read but only 48,500 were kept. If this query runs frequently, an index would help:
CREATE INDEX idx_orders_status_amount ON orders(status, amount);
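If the query always filters on the same status value, a PostgreSQL partial index is a smaller alternative to the composite index above (this assumes `'pending'` is the only status you query this way; the index name is illustrative):

```sql
-- Only indexes the pending orders, so the index stays small
CREATE INDEX idx_orders_pending_amount
    ON orders(amount)
    WHERE status = 'pending';
```

The planner will use it only for queries whose `WHERE` clause implies `status = 'pending'`.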
Bottleneck 3: Large Estimated vs Actual Row Discrepancy
Nested Loop (cost=0.43..150.00 rows=5 width=28)
(actual time=0.030..4850.200 rows=125000 loops=1)
The optimizer estimated 5 rows but got 125,000. It chose a nested loop because it expected very few iterations. With 125,000 actual iterations, the nested loop is catastrophically slow.
Fix: Update the table statistics so the optimizer can make better estimates:
-- PostgreSQL
ANALYZE orders;
-- PostgreSQL: More thorough statistics for a specific column
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;
ANALYZE orders;
After refreshing statistics, the optimizer might switch to a hash join, which handles large row counts much better.
Bottleneck 4: Disk-Based Sorts
Sort (cost=185000.00..190000.00 rows=2000000 width=36)
(actual time=2850.300..3420.500 rows=2000000 loops=1)
Sort Key: order_date
Sort Method: external merge Disk: 95200kB
Sort Method: external merge Disk: 95200kB means the data did not fit in memory and was sorted on disk. Disk sorts are dramatically slower than in-memory sorts.
Fixes:
- Increase work memory (PostgreSQL):
SET work_mem = '256MB';
-- Then re-run the query
- Add an index on the sort column so the database reads data in sorted order:
CREATE INDEX idx_orders_date ON orders(order_date);
- Reduce the data being sorted by adding more selective filters.
Bottleneck 5: Nested Loop with High Loops Count
Nested Loop (cost=0.43..850000.00 rows=500000 width=28)
(actual time=0.050..12500.300 rows=485000 loops=1)
-> Seq Scan on customers c (cost=0.00..2450.00 rows=100000 width=12)
(actual time=0.010..18.500 rows=100000 loops=1)
-> Index Scan using idx_orders_customer_id on orders o (cost=0.43..8.20 rows=5 width=16)
(actual time=0.008..0.095 rows=5 loops=100000)
The inner index scan has loops=100000. Each loop is fast (0.095ms), but 100,000 of them add up to 9.5 seconds. A hash join would be more efficient here:
-- Hint PostgreSQL to disable nested loops and try a hash join
SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT ...;
SET enable_nestloop = on; -- Reset to default
Disabling join strategies with SET enable_nestloop = off is a diagnostic tool, not a permanent solution. Use it to confirm that a different join strategy is faster, then investigate why the optimizer did not choose it (usually stale statistics or missing indexes).
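To keep the override from leaking into the rest of your session, you can scope it to a single transaction with `SET LOCAL` (PostgreSQL):

```sql
BEGIN;
SET LOCAL enable_nestloop = off;  -- applies only until COMMIT/ROLLBACK
EXPLAIN ANALYZE SELECT ...;
ROLLBACK;                         -- the setting reverts automatically
```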
MySQL EXPLAIN Output
MySQL's EXPLAIN output is formatted differently. Here is how to read it:
EXPLAIN SELECT c.name, o.amount
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'US';
MySQL output:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SIMPLE | c | ref | PRIMARY,idx_country | idx_country | 12 | const | 32000 | 100.00 | NULL |
| 1 | SIMPLE | o | ref | idx_orders_customer | idx_orders_customer | 4 | mydb.c.id | 20 | 100.00 | NULL |
Key columns to focus on:
| Column | What to Look For |
|---|---|
| type | Access method. ALL = full table scan (bad). ref = index lookup (good). eq_ref = unique index lookup (best). range = index range scan. |
| key | Which index is actually used. NULL = no index used. |
| rows | Estimated rows examined. Lower is better. |
| filtered | Percentage of rows remaining after filtering. 100% = no rows discarded. 10% = 90% discarded (filter is selective). |
| Extra | Additional info. Using filesort = extra sort step. Using temporary = temp table created. Using index = index-only scan. |
Watch for these in the Extra column:
- `Using filesort`: Data must be sorted separately from any index order. May be slow for large results.
- `Using temporary`: A temporary table is created (common with `GROUP BY` or `DISTINCT`). Can be slow.
- `Using where`: Rows are filtered after reading. Fine if the `type` column shows an index is used; bad if `type` is `ALL`.
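MySQL 8.0.16 and later also support a tree-style format that reads much like PostgreSQL's output, which many people find easier to scan than the tabular form:

```sql
EXPLAIN FORMAT=TREE
SELECT c.name, o.amount
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'US';
```

`EXPLAIN ANALYZE` (MySQL 8.0.18+) always uses this tree format and adds actual timings and row counts per node.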
SQL Server Execution Plans
SQL Server provides graphical execution plans in SSMS (SQL Server Management Studio) and text-based plans via SET commands.
Generating a Plan
-- Text-based estimated plan
SET SHOWPLAN_TEXT ON;
GO
SELECT * FROM orders WHERE customer_id = 42;
GO
SET SHOWPLAN_TEXT OFF;
-- Detailed XML plan (includes actual statistics)
SET STATISTICS XML ON;
GO
SELECT * FROM orders WHERE customer_id = 42;
GO
SET STATISTICS XML OFF;
Key Operators to Watch
| Operator | Meaning | Good or Bad? |
|---|---|---|
| Clustered Index Seek | Navigates the clustered index directly | Good |
| Index Seek | Navigates a non-clustered index | Good |
| Clustered Index Scan | Reads the entire table | Bad (usually) |
| Table Scan | Full scan of a heap table | Bad (usually) |
| Key Lookup | Goes back to the clustered index to fetch missing columns | OK in small numbers, bad in large numbers |
| Sort | Sorts data in memory or on disk | Potentially expensive |
| Hash Match | Hash join or hash aggregate | Normal for larger datasets |
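Before reading plans, it helps to know which statements are worth the effort. SQL Server's plan cache can be queried through DMVs; here is a sketch that lists the slowest cached queries by average elapsed time (requires `VIEW SERVER STATE` permission; times are in microseconds):

```sql
SELECT TOP 10
    qs.total_elapsed_time / qs.execution_count AS avg_elapsed_us,
    qs.execution_count,
    SUBSTRING(st.text, 1, 200) AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_elapsed_us DESC;
```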
A Practical Debugging Workflow
Here is a step-by-step workflow for diagnosing a slow query using execution plans:
Step 1: Get the Actual Plan
EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id) AS order_count, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE c.country = 'US'
AND o.order_date >= '2024-01-01'
GROUP BY c.name
ORDER BY total DESC
LIMIT 20;
Step 2: Read Bottom-Up, Find the Slowest Node
Look at the actual time for each node, remembering that a parent's actual time includes its children's. The node that accounts for the largest share of the total once you subtract its children's time, or a node with a very high loops count, is your bottleneck.
Step 3: Check for Warning Signs
| Warning Sign | What It Means | Fix |
|---|---|---|
| `Seq Scan` on a large table with a selective filter | Missing index | Create an index on the filtered column |
| `Rows Removed by Filter: 1,500,000` | Index not covering the filter condition | Add or adjust indexes |
| Estimated `rows=10`, actual `rows=500,000` | Stale statistics | Run `ANALYZE` |
| `Sort Method: external merge Disk:` | Sort spilling to disk | Increase `work_mem` or add sorted index |
| `Nested Loop` with `loops=100,000` | Suboptimal join strategy | Update statistics, consider index changes |
| `Heap Fetches: 50,000` on an Index Only Scan | Visibility map outdated | Run `VACUUM` (PostgreSQL) |
Step 4: Make One Change and Re-Check
Make one change at a time (add an index, update statistics, rewrite a condition) and re-run EXPLAIN ANALYZE to measure the impact. If you change multiple things at once, you will not know which one helped.
-- Before: Seq Scan, 450ms
-- Change: Create index
CREATE INDEX idx_orders_date_status ON orders(order_date, status);
-- After: Index Scan, 12ms
EXPLAIN ANALYZE SELECT ...;
Common Mistakes to Avoid
Mistake 1: Only Looking at Total Execution Time
-- "It ran in 200ms, that's fine."
-- But EXPLAIN ANALYZE reveals:
Seq Scan on orders (cost=0.00..45218.00 rows=2000000 width=36)
(actual time=0.010..185.000 rows=2000000 loops=1)
200ms may seem acceptable, but the query is scanning 2 million rows to answer a simple question. With concurrent users, this will degrade fast. Always look at the plan structure, not just the total time.
Mistake 2: Ignoring Estimated vs Actual Row Differences
HashAggregate (cost=100.00..110.00 rows=50 width=20)
(actual time=5200.000..5800.000 rows=450000 loops=1)
Estimated 50 rows, actual 450,000. The optimizer chose a hash aggregate sized for 50 groups. With 450,000 groups, it overflowed to disk. Fix the statistics:
ANALYZE orders;
Mistake 3: Using EXPLAIN Without ANALYZE for Performance Tuning
EXPLAIN alone shows only estimates. The actual execution can be very different:
-- EXPLAIN shows estimated rows=100, cost=50
-- But the actual query processes 500,000 rows and takes 8 seconds
Always use EXPLAIN ANALYZE when tuning performance. The estimates are useful for understanding the optimizer's reasoning, but actual numbers tell you the truth.
Mistake 4: Not Running EXPLAIN After Creating an Index
-- Created an index
CREATE INDEX idx_orders_amount ON orders(amount);
-- Assumed it would be used, never checked
-- The query actually wraps amount in a function, bypassing the index
SELECT * FROM orders WHERE ROUND(amount, 0) = 250;
Always verify with EXPLAIN that the index is actually being used.
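For the `ROUND` example above, one fix is an expression index (PostgreSQL), so the function's result itself is what gets indexed (the index name is illustrative):

```sql
-- Index the computed expression the query actually filters on
CREATE INDEX idx_orders_amount_rounded ON orders (ROUND(amount, 0));

-- This WHERE clause can now use the index
SELECT * FROM orders WHERE ROUND(amount, 0) = 250;
```

The expression in the query must match the indexed expression exactly for the planner to use it.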
Quick Reference: What to Look For
- Prefer Index Only Scan > Index Scan > Bitmap Scan > Seq Scan
- Avoid Sequential Scans on large tables
- Rows Removed by Filter should be low
- Estimated rows ≈ Actual rows (within ~10×)
- Watch for large row estimate mismatches (run `ANALYZE` if needed)
- Sort Method: `quicksort` (good) vs `external merge` (disk spill, bad)
- Heap Fetches = 0 for true Index Only Scans
- Loops count should be low on inner nodes of Nested Loops
- Join strategy should match data size (Nested Loop / Hash Join / Merge Join)
- Identify the node with the highest Actual Total Time
- With `BUFFERS`: prefer high shared hits, low shared reads
Summary
SQL EXPLAIN is your window into how the database actually processes your queries. It transforms performance tuning from guesswork into evidence-based optimization.
- `EXPLAIN` shows the estimated execution plan without running the query. `EXPLAIN ANALYZE` runs the query and shows actual statistics. Always prefer `EXPLAIN ANALYZE` for performance tuning.
- Sequential scans read every row in a table. Index scans navigate a B-tree to jump directly to matching rows. Index-only scans are the fastest, reading only from the index.
- Plans are read bottom-up. Data flows from inner (child) nodes to outer (parent) nodes. The top node produces the final result.
- Bottleneck indicators: unexpected sequential scans, high `Rows Removed by Filter`, large discrepancies between estimated and actual rows, disk-based sorts, and nested loops with high iteration counts.
- Fix bottlenecks by creating targeted indexes, updating statistics (`ANALYZE`), rewriting queries to avoid functions on indexed columns, and increasing memory settings for sort operations.
- Make one change at a time and re-run `EXPLAIN ANALYZE` to measure impact.
Make EXPLAIN ANALYZE a habit. Before you declare any query "done," run it through EXPLAIN ANALYZE at least once. The five seconds it takes can save you hours of debugging production performance issues later.