How to Create and Manage Indexes in SQL
Knowing what indexes are is one thing. Knowing how to create, configure, and manage them effectively is what actually makes your database fast. A well-chosen index can turn a query that takes 30 seconds into one that returns in 5 milliseconds. A poorly chosen index wastes disk space, slows down writes, and may never even get used by the query optimizer.
This guide is a hands-on, practical deep dive into SQL CREATE INDEX: how to create single-column and composite indexes, how unique indexes enforce data integrity, how to choose the right columns and column order, and how to safely drop indexes you no longer need. Every concept includes practical examples, database-specific syntax differences, and the reasoning behind each decision.
The Sample Data
We will use the following tables throughout this guide. Imagine these at production scale with millions of rows.
customers table:
| id | first_name | last_name | country | signup_date | is_active | |
|---|---|---|---|---|---|---|
| 1 | alice@example.com | Alice | Morgan | US | 2022-03-15 | true |
| 2 | bob@mail.co | Bob | Chen | UK | 2022-06-01 | true |
| 3 | charlie@inbox.net | Charlie | Garcia | US | 2023-01-20 | false |
| 4 | diana@webmail.org | Diana | Patel | CA | 2023-04-10 | true |
| 5 | eve@example.com | Eve | Morgan | US | 2023-08-05 | true |
| 6 | frank@mail.co | Frank | Smith | UK | 2024-01-14 | true |
CREATE TABLE customers (
id INT PRIMARY KEY,
email VARCHAR(100) NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
country CHAR(2) NOT NULL,
signup_date DATE NOT NULL,
is_active BOOLEAN NOT NULL
);
INSERT INTO customers (id, email, first_name, last_name, country, signup_date, is_active) VALUES
(1, 'alice@example.com', 'Alice', 'Morgan', 'US', '2022-03-15', true),
(2, 'bob@mail.co', 'Bob', 'Chen', 'UK', '2022-06-01', true),
(3, 'charlie@inbox.net', 'Charlie', 'Garcia', 'US', '2023-01-20', false),
(4, 'diana@webmail.org', 'Diana', 'Patel', 'CA', '2023-04-10', true),
(5, 'eve@example.com', 'Eve', 'Morgan', 'US', '2023-08-05', true),
(6, 'frank@mail.co', 'Frank', 'Smith', 'UK', '2024-01-14', true);
orders table:
| id | customer_id | order_date | amount | status | shipping_method |
|---|---|---|---|---|---|
| 1 | 1 | 2024-01-05 | 250.00 | completed | standard |
| 2 | 2 | 2024-01-12 | 430.00 | completed | express |
| 3 | 1 | 2024-01-20 | 180.50 | pending | standard |
| 4 | 3 | 2024-02-08 | 520.00 | completed | express |
| 5 | 2 | 2024-02-15 | 310.00 | cancelled | standard |
| 6 | 1 | 2024-03-01 | 275.00 | completed | express |
| 7 | 4 | 2024-03-18 | 640.00 | completed | overnight |
| 8 | 5 | 2024-04-02 | 190.00 | pending | standard |
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
amount DECIMAL(12,2) NOT NULL,
status VARCHAR(20) NOT NULL,
shipping_method VARCHAR(20) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
);
INSERT INTO orders (id, customer_id, order_date, amount, status, shipping_method) VALUES
(1, 1, '2024-01-05', 250.00, 'completed', 'standard'),
(2, 2, '2024-01-12', 430.00, 'completed', 'express'),
(3, 1, '2024-01-20', 180.50, 'pending', 'standard'),
(4, 3, '2024-02-08', 520.00, 'completed', 'express'),
(5, 2, '2024-02-15', 310.00, 'cancelled', 'standard'),
(6, 1, '2024-03-01', 275.00, 'completed', 'express'),
(7, 4, '2024-03-18', 640.00, 'completed', 'overnight'),
(8, 5, '2024-04-02', 190.00, 'pending', 'standard');
products table:
| id | sku | name | category | price |
|---|---|---|---|---|
| 1 | WDG-001 | Widget Pro | Electronics | 79.99 |
| 2 | GDG-002 | Gadget Max | Electronics | 149.99 |
| 3 | CHR-003 | Office Chair | Furniture | 299.00 |
| 4 | LMP-004 | Desk Lamp | Furniture | 45.00 |
| 5 | WDG-005 | Widget Basic | Electronics | 29.99 |
CREATE TABLE products (
id INT PRIMARY KEY,
sku VARCHAR(20) NOT NULL,
name VARCHAR(100) NOT NULL,
category VARCHAR(50) NOT NULL,
price DECIMAL(10,2) NOT NULL
);
INSERT INTO products (id, sku, name, category, price) VALUES
(1, 'WDG-001', 'Widget Pro', 'Electronics', 79.99),
(2, 'GDG-002', 'Gadget Max', 'Electronics', 149.99),
(3, 'CHR-003', 'Office Chair', 'Furniture', 299.00),
(4, 'LMP-004', 'Desk Lamp', 'Furniture', 45.00),
(5, 'WDG-005', 'Widget Basic', 'Electronics', 29.99);
CREATE INDEX
The CREATE INDEX statement builds a new index on one or more columns of an existing table. Once created, the database automatically maintains the index as data is inserted, updated, or deleted.
Basic Syntax
CREATE INDEX index_name
ON table_name (column_name);
| Component | Description |
|---|---|
index_name | A name you choose for the index. Should be descriptive. |
table_name | The table to index |
column_name | The column (or columns) to include in the index |
Your First Index
Suppose your application frequently looks up customers by email:
SELECT * FROM customers WHERE email = 'alice@example.com';
Without an index, the database scans every row. Let's create one:
CREATE INDEX idx_customers_email
ON customers (email);
Now the database can locate the matching row through the index's B-tree structure instead of scanning the entire table.
Verifying the Index Was Created
-- PostgreSQL
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'customers';
-- MySQL
SHOW INDEX FROM customers;
-- SQL Server
EXEC sp_helpindex 'customers';
-- SQLite
.indexes customers
PostgreSQL output:
| indexname | indexdef |
|---|---|
| customers_pkey | CREATE UNIQUE INDEX customers_pkey ON public.customers USING btree (id) |
| idx_customers_email | CREATE INDEX idx_customers_email ON public.customers USING btree (email) |
The primary key index (customers_pkey) was created automatically. Our new idx_customers_email appears alongside it.
Verifying the Index Is Being Used
Creating an index does not guarantee the optimizer will use it. Always verify with EXPLAIN:
-- PostgreSQL
EXPLAIN SELECT * FROM customers WHERE email = 'alice@example.com';
Output (index is used):
Index Scan using idx_customers_email on customers (cost=0.14..8.16 rows=1 width=85)
Index Cond: (email = 'alice@example.com'::text)
Output (index is NOT used, full scan instead):
Seq Scan on customers (cost=0.00..12.50 rows=1 width=85)
Filter: (email = 'alice@example.com'::text)
If you see Seq Scan (PostgreSQL), type: ALL (MySQL), or Table Scan (SQL Server), the index is not being used and you should investigate why.
Use EXPLAIN ANALYZE in PostgreSQL to see actual execution time, not just the plan:
EXPLAIN ANALYZE SELECT * FROM customers WHERE email = 'alice@example.com';
This runs the query and shows real timing data, which is invaluable for measuring the actual impact of your index.